Reddit Sues AI Companies Over Data Scraping

Reddit filed a lawsuit against Perplexity AI and three other companies for allegedly stealing user data through unauthorized scraping operations.

Context

Reddit possesses valuable user conversation data that AI companies need to train their chatbots and improve their products. The platform has licensed its data to companies like OpenAI and Google through formal agreements, but other firms have reportedly obtained Reddit content without permission or payment through data scraping – the automated collection of information from websites using computer programs called bots or scrapers.

Lawsuit Filed

On Wednesday, Reddit filed a lawsuit in federal court in Manhattan, alleging that four companies engaged in illegal data collection from its platform. The lawsuit named Perplexity AI, a San Francisco-based search engine company, along with three data scraping firms: Lithuanian company Oxylabs, Russian company AWMProxy, and Texas-based SerpApi.

According to the complaint, the three scraping firms collected Reddit data by extracting it from Google search results, then packaged and resold that information to AI companies. Reddit alleged that Perplexity purchased data from at least one of these firms rather than negotiating a direct licensing deal with Reddit.

Reddit’s Claims

Reddit sought monetary damages and a court order to halt the scraping operations.

Ben Lee, Reddit's chief legal officer, said in a statement that AI companies are engaged in an “arms race for quality human content” that has created an “industrial-scale ‘data laundering’ economy.” He added that Reddit is a prime target because it represents “one of the largest and most dynamic collections of human conversation ever created.”

Company Response

Representatives of three of the accused companies denied wrongdoing. Perplexity stated it would “fight vigorously for users’ rights to freely and fairly access public knowledge” and called the lawsuit a “show of force” in Reddit’s data negotiations with other companies.

SerpApi said it “strongly disagreed” with the allegations. Oxylabs said it was “shocked and disappointed” by the suit, arguing that no company should claim ownership of public data. AWMProxy could not be reached for comment.
