Is Web Scraping Legal? A Million-Dollar Mystery, Demystified!

  • 12/12/2016

While many of our articles have mentioned how useful and convenient data scraping and web crawling services are, the real question still lingers on the minds of many: is it even legal? Well, that’s a million-dollar question! No, literally, it is. You can either save or pay a million dollars in penalties if you try to play the game without knowing the rules.

Many startups and entrepreneurs swear by the high ROI that data scraping and web crawling have generated for them. Though the processes may sound fairly new and modern, web crawling was very much present even during the dotcom era’s mid-life crisis and bubble burst. Yes, that’s around the Y2K period. What’s more, the legality of web crawling has been battled out in court for over a decade and a half now.

To keep things straight and understandable, here are snippets of milestone cases which led to major reforms in determining the legality of web crawling as a service.

eBay: The first round of shots fired

Around 2000, eBay fired its first round of legal shots at Bidder’s Edge. The reason? Bidder’s Edge was an auction data aggregator that relied solely on scraping auction data off eBay’s website, and eBay argued that this violated the trespass to chattels doctrine (interference with another party’s personal property, in this case eBay’s servers). While the case was settled out of court, the judge acknowledged that heavy bot traffic can disrupt the services offered by the victim.

Intel Corp vs Hamidi: Hit Ground Zero.

Three years later, in 2003, a related trespass to chattels case brought by Intel Corp against Hamidi was decided. This time, the California Supreme Court departed from the reasoning in eBay vs Bidder’s Edge, stating that trespass to chattels did not apply in the context of computers where no real damage was done to the personal property.

Things were back to square one. What followed was a series of cases where a mere Terms of Use disclosure was ruled insufficient to prohibit scrapers.

Facebook: Turn the Tides, As Always

In 2009, Facebook sued power.com, which aggregated multiple social networking sites on one centralized website. This time the complaint was not built on trespass to chattels but on copyright claims. In denying power.com’s motion to dismiss the case, the court asserted that copying in any quantity can matter in the context of copyright infringement.

AT&T: Big Enough to Let Go

In 2010, hacker Andrew “Weev” Auernheimer exploited a flaw in AT&T’s security system and extracted user email addresses to create a database. Though it was clearly a case of oversight on the part of AT&T, and Andrew had neither caused damage nor extracted anything beyond those email addresses, the court convicted him on grounds of breach. The conviction was later vacated on appeal in 2014 on venue grounds.

2013: Meltwater Melts Away

Meltwater, a software company whose “Global Media Monitoring” product aggregates global media content for its paid clients, was sued by the Associated Press on grounds of copyright infringement and hot news misappropriation, and the court ruled against Meltwater.

QVC vs Resultly: A Surprising Result

In 2014, Resultly, a shopping app startup, was sued by QVC, a well-known television retailer, for excessively crawling its site. Fortunately, good sense prevailed in court and the ruling went in Resultly’s favor. The court asserted that Resultly’s intent was not to cause damage to QVC but just to build a user base for general benefit.

Key Takeaways:

While the legality of web crawling as a function sits pretty much in a grey area, it is quite evident that violation of associated laws and malicious intent are the key parameters courts weigh when reaching a verdict.

The best part about working with Botscraper is that the process is designed to abide by even the finest points of the law and protect your interests while maximizing results and extractions.

Botscraper understands Terms of Use disclaimers and abides by the rules and ethics of the internet. The bot is polite enough to request data rather than attack or penetrate a webpage in ways that inflict damage or nuisance. Botscraper understands and respects robots.txt and extracts only as much as permitted. Botscraper is intelligent enough to identify copyrighted content and prudent enough to abide by fair use. And while it focuses on maximum data extraction, Botscraper ensures that it does not disturb the website’s traffic.
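To make the idea of a polite bot concrete, here is a minimal sketch of what respecting robots.txt and pacing requests could look like. It is not Botscraper’s actual implementation; it uses Python’s standard `urllib.robotparser` module, and the sample robots.txt content, the `example-polite-bot` user agent, the example.com URLs and the helper names are all assumptions made up for illustration.

```python
import time
from urllib import robotparser

# Hypothetical robots.txt, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "example-polite-bot"  # hypothetical bot name


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_delay(parser, user_agent=USER_AGENT, default=1.0):
    """Use the site's Crawl-delay if it declares one, else a default pause."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def crawl(parser, urls, fetch, sleep=time.sleep):
    """Fetch permitted URLs one at a time, pausing between requests.

    `fetch` is any caller-supplied function that takes a URL and returns
    the page content; it is left abstract here on purpose.
    """
    pause = polite_delay(parser)
    results = {}
    for url in allowed_urls(parser, urls):
        results[url] = fetch(url)
        sleep(pause)  # spread requests out to avoid burdening the site
    return results
```

In a real crawler the robots.txt text would be downloaded from the target site rather than inlined, and checks like these cover only the technical courtesy side; they do not by themselves settle Terms of Use or copyright questions.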

What’s better than having top-class, highly insightful data? Having it all legally! That is exactly where Botscraper earns its place in your strategy meetings. It’s not just about the data; it’s about everything related to data.

Is scraping legal? This may be a million-dollar question, but Botscraper will work only to help you generate those millions! Whether scraping is legal in every case, nobody can say for certain; but Botscraper knows how to be the best and play by the rules at the same time.

 

