Artificial intelligence (AI) and machine learning (ML) seem to have piqued the interest of automated data collection providers. While web scraping has been around for some time, AI/ML implementations have appeared in the line of sight of providers only recently.
Aleksandras Šulženko, Product Owner at Oxylabs.io, who has been working with these solutions for several years, shares his insights on the importance of artificial intelligence, machine learning, and web scraping.
BN: How has the implementation of AI/ML solutions changed the way you approach development?
AS: AI/ML has an interesting work-payoff ratio. Good models can sometimes take months to write and develop. Until then, you don't really have anything. Dedicated scrapers or parsers, on the other hand, can take up to a day or two. When you have an ML model, however, maintaining it takes a lot less time for the amount of work it covers.
So, there's always a choice. You can build dedicated scrapers and parsers, which will take significant amounts of time and effort to maintain once they start stacking up. The other choice is to have "nothing" for a significant amount of time, but a brilliant solution later on, which will save you tons of time and effort.
There's some theoretical point where developing custom solutions is no longer worth it. Unfortunately, there's no mathematical formula to arrive at the correct answer. You have to make a decision when all the repetitive tasks are just too much of a hog on resources.
BN: Have these solutions had a visible impact on the deliverability and overall viability of the project?
AS: Getting started with machine learning is tough, though. It's still, comparatively speaking, a niche specialization. In other words, you won't find many developers who dabble in ML, and knowing how hard it can be to find one for any discipline, it's definitely a tough river to cross.
Yet, if the business approach to scraping is based on a long-term vision, ML will definitely come in handy sometime down the road. Every good vision has scaling in it, and with scaling come repetitive tasks. These are best handled with machine learning.
Our awesome achievement we call Adaptive Parser is a great example. It was once almost unthinkable that a machine learning model could be of such high benefit. Now the solution can deliver parsed results from a multitude of e-commerce product pages, irrespective of the changes between them or any that happen over time. Such a solution is completely irreplaceable.
BN: In a previous interview, you've mentioned the importance of making things more user-friendly for web scraping solutions. Is there any particular reason you would recommend moving development towards no-code implementations?
AS: Even companies that have large IT departments may have issues with integration. Developers are almost always busy. Taking time out of their schedules for integration purposes is tough. Most end-users of the data Scraper APIs, after all, aren't tech-savvy.
Additionally, the departments that would need scraping the most, such as marketing, data analytics, etc., might not have enough sway in deciding the roadmaps of developers. As such, even relatively small hurdles can become impactful enough. Scrapers should now be developed with a non-tech user in mind.
There should be plenty of visuals that allow for a simplified construction of workflows with a dashboard that's used to deliver information clearly. Scraping is becoming something done by everyone.
BN: What do you think lies in the future of scraping? Will websites become increasingly protective of their data, or will they eventually forego most anti-scraping sentiment?
AS: There are two answers I can give. One is "more of the same". Surely a boring one, but it's inevitable. Delving deeper into scaling and proliferation of web scraping isn't as fun as the next question: the legal context.
Currently, it seems as if our position in the industry isn't perfectly decided. Case law forms the basis of how we think and approach web scraping. Yet, it all might change on a whim. We're closely monitoring the developments due to the inherent fragility of the situation.
There's a possibility that companies will realize the value of their data and start selling it on third-party marketplaces. It would reduce the value of web scraping as a whole, as you could simply acquire what you need for a small price. Most businesses, after all, need the data and the insights, not web scraping. It's a means to an end.
There's a lot of potential in the grand vision of Web 3.0, the initiative to make the whole Web interconnected and machine-readable. If this vision came to life, the whole data gathering landscape would be vastly transformed: the Web would become much easier to explore and organize, parsing would become a thing of the past, and webmasters would get used to the idea of their data being consumed by non-human actors.
Finally, I think user-friendliness will be the focus in the future. I don't mean just the no-code part of scraping. A large part of getting data is exploration: finding where and how it's stored and getting to it. Customers will often formulate an abstract request and developers will follow up with methods to acquire what is needed.
In the future, I expect, the exploration phase will be much simpler. Maybe we'll be able to take the abstract requests and turn them into something actionable through an interface. In the end, web scraping is breaking away from its shell of being something code-ridden or hard to understand and evolving into a daily activity for everyone.
Photo Credit: Photon photo/Shutterstock
Read more:
Tying Artificial intelligence and web scraping together [Q&A] - BetaNews