Reddit accuses Perplexity of Stealing its Data

October 23, 2025 16:03

(Image source: REUTERS)

The growing dispute between digital content platforms and AI companies has escalated sharply. According to media reports, Reddit has sued Perplexity AI for copyright infringement, claiming that the San Francisco-based company unlawfully scraped large volumes of user-generated content to train its AI models. Filed on Wednesday in a New York district court, the suit is the latest in a growing series of legal battles between AI developers and the platforms that host the internet's most valuable resource: genuine conversations between people.

In addition to Perplexity, Reddit's suit names three other organizations: Oxylabs, a data firm based in Lithuania; SerpApi, headquartered in Texas; and AWMProxy, which Reddit describes as a “previous Russian botnet.” According to the complaint, these entities allegedly helped obtain Reddit's copyright-protected material by masking their identities, hiding their locations, and making their scraping software imitate ordinary user activity.

Ben Lee, Reddit's chief legal officer, described the lawsuit as part of a broader conflict over human-created data. He asserted that "AI companies are fiercely competing to acquire superior human-created material," and cautioned that this competition has "stimulated an extensive economic system centered around the illicit processing of data." Reddit alleges that Perplexity was an "eager client" of these services, using content scraped from Reddit, often obtained through Google search results, to improve its AI-based "response technology." Perplexity, for its part, says it has not yet been formally served with the lawsuit and reiterates its commitment to broadening access to knowledge. The company said in a statement, "We are committed to consistently advocating for users' entitlements to unrestricted and equitable access to publicly available information," and added that its practices remain "ethical and accountable."

Oxylabs and SerpApi have likewise denied any wrongdoing and say they have not yet been formally served. Denas Grybauskas, an executive at Oxylabs, criticized Reddit for not reaching out before going to court. He stated, "Reddit did not attempt to engage in direct communication with us," and added that the company intends to defend its standing as a "leading entity in the domain of public data compilation." Anonymous sources told the Financial Times that Reddit had previously confronted Perplexity about the alleged scraping and had proposed a paid licensing deal that would authorize use of its data. However, Perplexity's co-founder and chief executive, Aravind Srinivas, reportedly showed little interest in a formal arrangement.

According to the reports, Reddit also contacted Google, asking the tech company to check whether Perplexity had used its search engine to access Reddit content without permission. The case adds to a growing number of lawsuits challenging how AI companies collect training data. Since the emergence of generative AI, news organizations, writers, and online communities have accused technology companies of taking copyrighted text and images without permission or fair payment. Notably, Reddit itself has struck lucrative licensing agreements with both Google and OpenAI, allowing them to use its extensive data to train large language models. By contrast, the company now claims that Perplexity and its co-defendants bypassed such agreements, using covert scraping techniques to gather Reddit’s content illegally.

Lee emphasized Reddit’s particular importance to AI companies, describing it as “one of the largest and most active collections of human conversation ever created.” This is not the first legal action Reddit has taken against an AI company; in June, it sued Anthropic, alleging that the start-up had scraped its site more than 100,000 times in under a year. At the time, Anthropic denied any wrongdoing and said the company would “defend ourselves vigorously.” With this new lawsuit, Reddit appears intent on sending a clear message: if AI companies want access to human knowledge from the internet, they will need to pay for it.
