A coalition of music publishers led by Concord Music Group and Universal Music Group filed a federal lawsuit on January 28, 2026, alleging Anthropic illegally downloaded more than 20,000 copyrighted musical compositions using BitTorrent file-sharing technology from notorious pirate websites. The complaint seeks damages that could exceed $3 billion.

The lawsuit names Anthropic PBC, CEO Dario Amodei, and co-founder Benjamin Mann as defendants. Filed in the United States District Court for the Northern District of California, the complaint alleges the AI company used BitTorrent to acquire millions of pirated books from Library Genesis and Pirate Library Mirror, including hundreds of songbooks and sheet music collections containing copyrighted lyrics owned by the publishers.

"While Anthropic misleadingly claims to be an AI 'safety and research' company, its record of illegal torrenting of copyrighted works makes clear that its multibillion-dollar business empire has in fact been built on piracy," the complaint states.

Separate claims from previous lawsuit

This represents the second major copyright action the music publishers have brought against Anthropic. The plaintiffs previously sued the company in October 2023 over its unauthorized use of 499 musical compositions in training and output from Claude AI models, a case known as Concord Music Group v. Anthropic PBC.

The publishers attempted to amend their original complaint to address the newly discovered torrenting violations after Judge William Alsup revealed Anthropic's BitTorrent activities in a July 2025 ruling in the separate Bartz v. Anthropic case. Anthropic successfully opposed that amendment, arguing the torrenting claims were "entirely unrelated" to the original lawsuit and would "fundamentally transform" the case.

The current lawsuit addresses two distinct categories of alleged infringement. First, it covers Anthropic's downloading and distributing of copyrighted works via BitTorrent from pirate libraries. Second, it alleges ongoing copying of publishers' works in training and output from Claude AI models released after the publishers filed their amended complaint in the first case.

Anthropic has released multiple new Claude versions since that initial filing, including Claude Sonnet 4.5 on September 29, 2025, Claude Haiku 4.5 on October 15, 2025, and Claude Opus 4.5 on November 24, 2025. The complaint alleges each of these models was trained using unauthorized copies of the publishers' musical compositions.

BitTorrent downloads from pirate sites

The lawsuit details how Anthropic executives, including Amodei and Mann, personally discussed and authorized the illegal downloading of millions of books from Library Genesis and Pirate Library Mirror using BitTorrent protocols.

In June 2021, Mann personally used BitTorrent to download approximately five million copies of pirated books from LibGen, according to the complaint. Before proceeding, he discussed the plan with Amodei, chief science officer Jared Kaplan, and other senior leadership. Anthropic's Archive Team had deemed LibGen a "blatant violation of copyright," while Amodei himself described the pirate library as "sketchy."

Despite these concerns, company leadership approved the torrenting activity. Amodei admitted at the time that Anthropic "ha[d] many places from which" it could have legally purchased these copyrighted works for training but chose to illegally torrent them instead because it was faster and free, the complaint states. In Amodei's own words, they did so to avoid a "legal/practice/business slog."

In July 2022, Anthropic downloaded millions of additional pirated books from Pirate Library Mirror, a shadow library that mirrored the contents of the shuttered Z-Library. When one Anthropic founder discovered he could torrent additional works from PiLiMi, he messaged colleagues "[J]ust in time!" Another employee responded, "zlibrary my beloved," according to the complaint.

The bibliographic metadata in the LibGen and PiLiMi catalogs reveals that among the millions of books Anthropic downloaded were hundreds containing sheet music and song lyrics to musical compositions owned by the publishers. These included large numbers of books published by the publishers' sheet music licensees Hal Leonard and Alfred Music.

Specific titles mentioned in the complaint include The Best Songs Ever featuring "Candle in the Wind" and "Every Breath You Take," VH1's 100 Greatest Songs of Rock & Roll featuring "All Along the Watchtower" and "Good Vibrations," Rolling Stones - Let It Bleed: Authentic Guitar TAB featuring "Gimme Shelter," Elton John - Greatest Hits Songbook featuring "Rocket Man," Creedence Clearwater Revival: Easy Guitar featuring "Have You Ever Seen The Rain," and Harry Styles Songbook featuring "Sign of the Times."

Two-way infringement via BitTorrent

BitTorrent is peer-to-peer by design: a client downloading a file simultaneously uploads the pieces it already holds to other users in the swarm. When Anthropic downloaded copies of pirated books via torrenting, the protocol therefore simultaneously uploaded unauthorized copies of the same works to the public.
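The mechanism can be sketched with a toy swarm model (illustrative Python only, not any party's actual code; all names here are hypothetical): every peer that downloads a piece can immediately serve it to other peers, so downloading and uploading are inseparable.

```python
# Toy model of BitTorrent piece exchange: every download from the swarm
# is simultaneously an upload by whichever peer serves the piece.
# Illustrative only -- real clients add piece selection, choking, etc.

def simulate_swarm(num_pieces, peers):
    """Run rounds of piece exchange until every peer has the full file.
    `peers` maps peer name -> set of piece indices already held.
    Assumes at least one seed holding all pieces, so exchange never stalls."""
    uploads = {p: 0 for p in peers}
    downloads = {p: 0 for p in peers}
    while any(len(held) < num_pieces for held in peers.values()):
        for p in peers:
            missing = sorted(set(range(num_pieces)) - peers[p])
            if not missing:
                continue
            piece = missing[0]
            holders = [q for q in peers if q != p and piece in peers[q]]
            if holders:
                # Least-loaded holder serves the piece: that peer uploads.
                src = min(holders, key=lambda q: uploads[q])
                peers[p].add(piece)    # reproduction (download)
                downloads[p] += 1
                uploads[src] += 1      # distribution (upload)
    return uploads, downloads

# One seed, two downloaders ("leechers") starting with nothing.
peers = {"seed": {0, 1, 2, 3}, "leecher_a": set(), "leecher_b": set()}
uploads, downloads = simulate_swarm(4, peers)
```

In aggregate, every download is matched by an upload, and even the "leecher" peers end up distributing pieces they acquired mid-download, which is the two-way copying the complaint describes.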

This two-way copying violated both the publishers' exclusive right of reproduction through downloading and their exclusive right of distribution through uploading. The complaint alleges each pirated work Anthropic torrented was likely shared thousands or tens of thousands of times, depriving publishers of substantial revenue.

"Defendants' use of BitTorrent caused extensive harm to Publishers," the complaint states. "Each pirated work Defendants torrented was likely shared thousands if not tens of thousands of times, depriving Publishers of substantial revenue. Defendants also contributed to the continued viability of BitTorrent and pirate libraries as tools for infringement that only exist as long as they have users."

Anthropic copied these books to amass a "vast central library" of written texts the company intended to maintain forever, according to the complaint. While Anthropic has claimed it did not use any of these illegally torrented books to train commercial Claude models, the lawsuit argues the torrenting itself constitutes standalone copyright infringement regardless of subsequent use.

"Regardless of Anthropic's later use, its piracy of these books via BitTorrent was unquestionably infringing," the complaint states. "Even if some subset of the books Defendants illegally torrented were sometimes used for AI training, that cannot excuse their mass torrenting of millions of pirated books without paying for them - including books containing Publishers' musical compositions."

Continued AI training infringement

Beyond the torrenting claims, the lawsuit alleges Anthropic continues to copy publishers' works on a massive scale for AI training even after the first lawsuit was filed. The complaint identifies 20,517 musical compositions that Anthropic allegedly infringed through training newer Claude models and the output those models generate.

Anthropic trains each new Claude model from scratch using newly copied training corpora that include the publishers' copyrighted works, according to the complaint. The company copies this training data from multiple sources, including scraping websites, scanning physical books, and exploiting third-party datasets.

These datasets include The Pile, which incorporates the Books3 collection of pirated books and YouTube Subtitles containing closed captions from videos that capture lyrics to publishers' compositions. Anthropic also exploits the Common Crawl dataset, which contains publishers' copyrighted lyrics scraped without permission from websites of the publishers' licensees including MusixMatch, LyricFind, and Genius.

The complaint details Anthropic's "cleaning" process during training, which removes material inconsistent with its business model but conspicuously does not remove unauthorized copyrighted content such as publishers' lyrics. Instead, Anthropic uses extractor tools to remove copyright notices and other copyright management information from the copied text.

"Anthropic wants to train its Claude AI models specifically on the content of Publishers' musical compositions, including the lyrics, so that the models' output will reproduce that expressive content, rather than copyright notices or other Copyright Management Information accompanying those lyrics, information that is critical to protecting Publishers' rights but which Anthropic deemed useless," the complaint states.

The lawsuit includes a separate claim under Section 1202 of the Copyright Act for removal or alteration of copyright management information. Song titles, author names, and copyright owner information constitute legally protected copyright management information.

Anthropic has intentionally removed this information both during AI training and in model outputs, according to the complaint. As early as May 2021, high-ranking Anthropic employees including founders Mann and Kaplan discussed extraction tools used to filter training data.

In June 2021, Mann and Kaplan concluded that, compared with other tools, the jusText extraction tool left too much "useless junk" in scraped web data, such as copyright notice information contained in footers. Mann expressed his desire that the AI "model will learn to ignore the boilerplate," like copyright notices.

In one internal chat, an Anthropic staff member shared an example showing that when jusText was applied to a scraped webpage containing footnotes, a copyright owner name, and a "© 2019" copyright notice, it left that information untouched. In contrast, the Newspaper extraction tool removed the footnotes, copyright owner name, and copyright notice entirely, which was considered "a significant improvement."
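The kind of filtering at issue can be illustrated with a minimal sketch (hypothetical code, not Anthropic's tooling and not the actual jusText or Newspaper libraries): a cleaner tuned to drop footer "junk" also drops copyright notices, which is precisely the removal of copyright management information the complaint alleges.

```python
import re

# Illustrative toy boilerplate filter. An aggressive footer stripper of
# this kind discards copyright notices (Copyright Management Information)
# along with genuine junk. Hypothetical code, not any party's actual tool.

COPYRIGHT_LINE = re.compile(r"(©|\(c\)|copyright)\s*\d{4}", re.IGNORECASE)

def strip_footer_boilerplate(text: str) -> str:
    """Drop lines that look like footer boilerplate, including any line
    carrying a copyright notice and the owner's name on that line."""
    kept = []
    for line in text.splitlines():
        if COPYRIGHT_LINE.search(line):
            continue  # notice and owner name are discarded together
        kept.append(line)
    return "\n".join(kept)

page = "Verse one of the song...\nFootnote text\n© 2019 Example Music Publisher"
print(strip_footer_boilerplate(page))
# The copyright line is gone; the lyric text survives for training.
```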

"Because Newspaper removed Copyright Management Information more effectively, Anthropic purposefully decided to employ that tool to remove copyright notices and other Copyright Management Information from Publishers' lyrics and other copyrighted works," the complaint states.

Anthropic deliberately extracts this information to prevent its models from displaying copyright notices alongside publishers' lyrics in outputs, thereby concealing the company's infringement from users, publishers, and other copyright owners, according to the complaint.

Claude models memorize and regurgitate lyrics

The lawsuit alleges Anthropic's Claude models are designed to "memorize" and regurgitate their training data, including publishers' copyrighted lyrics. This tendency is well documented and well known to Anthropic, according to the complaint.

In July 2020, several artificial intelligence researchers at OpenAI - including future Anthropic founders Amodei, Mann, Jack Clark and Kaplan - observed that "a major methodological concern with language models pretrained on a broad swath of internet data, particularly large models with the capacity to memorize vast amounts of content, is potential contamination of downstream tasks by having their test or development sets inadvertently seen during pre-training."

An Anthropic internal report stated more bluntly: "Large LMs memorize A LOT, like a LOT," according to the complaint.

The lawsuit argues these regurgitations are a feature rather than a bug. Anthropic understands that Claude users specifically seek publishers' lyrics and derivatives of those lyrics, and it has developed and trained Claude to respond to precisely those types of requests.

When developing Claude's fine-tuning process, Anthropic hired temporary workers to chat with the AI. In written instructions, Anthropic provided example tasks including "suggesting songs based on your favorite music" or "ask[ing] models to re-write text with style, content, and formatting changes or requests."

Anthropic's own employees frequently prompted Claude for song lyrics and derivatives when developing, testing, and using Claude models, according to the complaint. In January and February 2023, shortly before first releasing Claude to the public, numerous Anthropic employees discussed prompting Claude for copies of publishers' lyrics.

Anthropic founder and chief compute officer Tom Brown queried "@Claude what are the lyrics to desolation row by [Bob] Dylan?" Another employee prompted the model to "write a coherent poem made up of fragments" of "lyrics from the Beatles, Bob Dylan, and other classics from the 60s/70s." A third employee asked Claude "What are the lyrics to we found love by Calvin Harris?"

The complaint alleges that after Anthropic released its models to the public, third-party users made similar requests for publishers' lyrics, and Claude generated responses reproducing those lyrics in violation of publishers' rights.

Inadequate guardrails

After publishers filed their first lawsuit and publicly exposed Anthropic's infringement, the company adopted additional guardrails purportedly designed to minimize AI output copying publishers' copyrighted works. The complaint alleges these guardrails remain inadequate.

Anthropic deliberately chose to include lyrics for only a limited number of specific songs as part of its guardrails, including the 500 works identified in the first lawsuit. The guardrails will not comprehensively prevent output copying lyrics from the much broader universe of copyrighted songs beyond that limited set, according to the complaint.

The guardrails are not designed to block all prompts and output that may copy or contain publishers' copyrighted works, such as requests that Claude generate supposedly "new" or "original" songs in the style of specific artists. Anthropic's models continue to generate output containing publishers' lyrics even when not specifically requested.

Scientific literature confirms Claude will still deliver large amounts of copyrighted content as output despite the guardrails, according to the complaint. In research published in January 2026, Stanford researchers extracted large portions of copyrighted books from Claude 3.7 Sonnet, with the model reproducing 95.8% of Harry Potter and the Sorcerer's Stone nearly verbatim.
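A reproduction rate like the 95.8% figure could in principle be computed as the fraction of a source text recovered in long verbatim matching blocks of model output. The sketch below is an assumed methodology using Python's standard difflib, not the study's actual code, and the sample strings are invented.

```python
from difflib import SequenceMatcher

# Illustrative sketch: measure what fraction of a source text appears in
# long verbatim matching blocks of a model's output. Assumed methodology,
# not the Stanford study's actual code.

def verbatim_fraction(source: str, output: str, min_block: int = 20) -> float:
    """Fraction of `source` characters found in matching blocks of
    `output` at least `min_block` characters long."""
    matcher = SequenceMatcher(None, source, output, autojunk=False)
    matched = sum(block.size
                  for block in matcher.get_matching_blocks()
                  if block.size >= min_block)
    return matched / len(source)

src = "Mr and Mrs Dursley of number four Privet Drive were proudly normal."
out = "The model wrote: " + src  # output embeds the source verbatim
print(f"{verbatim_fraction(src, out):.1%}")
```

The `min_block` threshold is the key design choice: it counts only sustained verbatim runs, so incidental shared words and phrases do not inflate the score.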

"What's more, because these guardrails address only Claude output, and do nothing to prevent Anthropic's underlying exploitation of Publishers' lyrics in AI training, they are at most a band-aid - not a cure - for Anthropic's infringement," the complaint states.

Financial harm to publishers

The lawsuit alleges Anthropic's conduct has caused substantial and irreparable harm to publishers and their songwriters. Anthropic's use of publishers' works without licenses deprives publishers of license fees and undercuts the entire licensing market.

Anthropic has created a tool, trained on unauthorized copies of publishers' works, that permits users to generate vast quantities of AI-generated lyrics and songs that compete with publishers' legitimate copyrighted works. This competition harms the market for and value of those works.

Distribution through BitTorrent is particularly pernicious, according to the complaint, because each file can be distributed hundreds or thousands of times through the swarm. Anthropic's widescale use of BitTorrent contributes to the continued viability and normalization of that infringing protocol.

The removal of copyright management information makes it harder for publishers to enforce their copyrights and protect their works from further exploitation. The sheer breadth and scope of Anthropic's copying makes it "effectively impossible to measure, calculate, or even estimate the financial damage it imposes on songwriters and publishers," the complaint states.

Anthropic is now valued at $350 billion or more, nearly double its $183 billion valuation of September 2025, a jump achieved in just two months between September and November 2025. The company has received billions of dollars in funding from Amazon, Google, and other investors.

The complaint alleges one of the main reasons Anthropic's AI models are popular and valuable is because the company trained those models on a text corpus that includes publishers' copyrighted lyrics. Publishers' copyrighted content serves as a draw for individual users, commercial customers, and investors.

Willful infringement allegations

The lawsuit alleges Anthropic's infringement is willful, intentional, and purposeful, in disregard of and with indifference to publishers' rights. Anthropic knows it is using plaintiffs' works without permission, yet has trained and publicly released multiple new versions of Claude since publishers filed their first lawsuit.

Anthropic closely monitors and analyzes user interactions with Claude and the output generated by Claude. The company collects user prompts and corresponding outputs to study specific ways Claude is being used. Anthropic is well aware based on this study that users request lyrics to publishers' works and that Claude delivers copies of those lyrics.

Analysis of Claude usage data has revealed clusters of requests to "help me find, analyze, or modify song lyrics," "translate songs or lyrics between languages," and "help me identify or find songs with specific characteristics," according to the complaint. These clusters include Claude prompts and output relating to publishers' lyrics specifically.

Anthropic implemented guardrails after the first lawsuit because it knew its models had been trained on publishers' lyrics, it monitored Claude user activity and output, it understood users were prompting models regarding publishers' lyrics, and it knew the models were generating specific output that unlawfully copied publishers' lyrics.

"When Anthropic first developed and later refined and expanded these guardrails, and when it monitored the effectiveness of the guardrails, it collected and analyzed Claude prompts and output data, including specific infringing output copying copyrighted works," the complaint states.

This lawsuit arrives amid intensifying copyright litigation across the AI industry. Anthropic agreed to a $1.5 billion settlement in September 2025 to resolve copyright infringement claims from authors over its use of pirated books to train Claude models.

That settlement in the Bartz case became the largest publicly reported copyright recovery in history. The case centered on roughly 500,000 published works, with authors receiving approximately $3,000 per work.

Judge William Alsup delivered a landmark split decision in June 2025 ruling that using copyrighted books to train large language models constitutes transformative fair use under copyright law. However, he allowed claims over pirated content to proceed to trial, finding that Anthropic's method of acquiring content through piracy was not protected.

"The judge understood the outrageous piracy," Authors' Guild CEO Mary Rasenberger said at the time. "The piracy liability comes with statutory damages for intentional copyright infringement, which are quite high per book."

The music publishers reached a partial agreement with Anthropic in January 2025 regarding copyright protection measures in their first case. That stipulation required Anthropic to maintain existing "Guardrails" designed to prevent copyright-infringing outputs and established a notification process for publishers to alert Anthropic when guardrails fail.

However, that agreement addressed only one portion of the publishers' preliminary injunction motion and did not resolve disputes regarding the use of copyrighted lyrics in training future AI models.

Industry implications

The lawsuit highlights tensions within the AI industry regarding content licensing and fair use. Publishers in the complaint note they have already begun to explore and enter into licenses permitting authorized uses of their musical compositions in connection with AI.

Universal Music Publishing Group recently entered into agreements with AI music generator Udio and AI music technology company KLAY to license certain works in connection with AI training, according to the complaint. Other large music publishers have similarly licensed AI companies to use their works.

"Publishers recognize the great potential of ethical AI as a powerful tool for the future, and have already begun to explore and enter into licenses permitting authorized uses of their musical compositions in connection with AI," the complaint states. "However, it remains crucial that AI technology be developed and employed ethically and responsibly, in a manner that protects the rights of Publishers and songwriters, their livelihoods, and the creative ecosystem as a whole."

This case arrives as multiple AI companies face similar copyright challenges. Ziff Davis filed a major lawsuit against OpenAI in April 2025 accusing the company of unauthorized use of content from 45 properties including CNET, IGN, and Mashable. Reddit sued Anthropic in June 2025 over alleged unauthorized AI training on platform data.

The Copyright Office released major guidance in May 2025 addressing when AI developers need permission to use copyrighted works. The report suggested that transformativeness and market effects would be the most significant factors in fair use determinations.

Congress introduced the TRAIN Act in July 2025, which would grant copyright owners subpoena power to identify works used in generative AI training without their permission or compensation.

The publishers seek statutory damages under the Copyright Act, which provides for damages up to $150,000 per work for willful infringement. With 20,517 works identified in Exhibit B alone, statutory damages could theoretically reach into the billions of dollars.
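The arithmetic behind that estimate is straightforward: multiplying the Exhibit B count by the statutory maximum gives the theoretical ceiling (illustrative arithmetic only; actual awards are set by the court).

```python
# Back-of-the-envelope check of the statutory ceiling cited in the article.
works = 20_517          # compositions identified in Exhibit B
max_per_work = 150_000  # willful-infringement cap, 17 U.S.C. § 504(c)
ceiling = works * max_per_work
print(f"${ceiling:,}")  # theoretical maximum across Exhibit B: $3,077,550,000
```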

The complaint also seeks injunctive relief requiring Anthropic and its officers, including Amodei and Mann, to cease infringing publishers' copyrights. Publishers request an order requiring Anthropic to provide an accounting of training data, training methods, and known capabilities of its AI models.

Additionally, publishers seek an order requiring defendants to destroy under court supervision all infringing copies of publishers' copyrighted works in defendants' possession or control.

The plaintiffs are represented by extensive legal teams including Oppenheim + Zebrak LLP, Coblentz Patch Duffy & Bass LLP, and Cowan, Liebowitz & Latman P.C. The complaint was filed in the United States District Court for the Northern District of California and demands a jury trial.

Anthropic did not respond to requests for comment.

Summary

Who: Concord Music Group, Universal Music Group, ABKCO Music, and other major music publishers filed the lawsuit against Anthropic PBC, CEO Dario Amodei, and co-founder Benjamin Mann in the United States District Court for the Northern District of California.

What: The publishers allege Anthropic illegally downloaded more than 20,000 copyrighted musical compositions using BitTorrent from pirate websites Library Genesis and Pirate Library Mirror, including hundreds of songbooks and sheet music collections. The lawsuit also alleges ongoing copyright infringement through training of newer Claude AI models and the outputs those models generate. Claims include direct copyright infringement through torrenting, ongoing infringement in AI training and output, contributory and vicarious infringement through Claude users, and removal of copyright management information.

When: The lawsuit was filed on January 28, 2026. The alleged torrenting occurred primarily in June 2021 and July 2022. The ongoing AI training infringement involves Claude models released from September 29, 2025, through November 24, 2025, and models currently in training.

Where: The case was filed in the United States District Court for the Northern District of California. Anthropic is headquartered in San Francisco. The alleged BitTorrent downloads occurred from overseas pirate library websites Library Genesis and Pirate Library Mirror. The copyrighted works include musical compositions owned by publishers with operations in Nashville, Santa Monica, New York, London, and Stockholm.

Why: This lawsuit matters for the marketing community because it represents one of the largest non-class action copyright cases in U.S. history, with potential damages exceeding $3 billion. The case addresses fundamental questions about how AI companies can legally acquire training data and whether current guardrails adequately protect copyrighted content. The outcome will influence licensing markets for creative content, affect the viability of AI-powered marketing tools, and potentially reshape how AI companies approach content acquisition. As AI systems become more deeply integrated into marketing workflows, precedents established by this case will determine whether companies can rely on AI platforms that trained on potentially infringing content and what legal risks they face when using such tools.
