AI, Anthropic, and What the Settlement Means for Authors

This news is nearly a year old , but it’s a precedent-setting case, worth going over again, especially as the legal framework around genAI continues to evolve.

A black toy robot on wheels sits on a desk, with a laptop and a pink container holding pens near a window. — Our robot overlords got sued. (Kindel Media / Pexels.com)

The case of several large publishers against Anthropic was one of the first to make its way through the courts. The judge delivered the ruling in June 2025. The Anthropic settlement was announced in August 2025.

I definitely don’t agree with everything that was set out in the case, although I can see why the judge came to the conclusions they did. I do think there were some positives in the case. And I also think there’s plenty of room for revision and stricter regulation around how genAI models are trained. (Personally, I think they could all be turned off and humanity wouldn’t be the worse for it, but I’m not a tech bro evangelist.)

This case got a lot of people talking—not necessarily about the Anthropic settlement, but more about negligence by publishers. Again, I think the door is open for further suits. It remains to be seen if anyone else agrees.

So, let’s dive in and see what this ruling could mean for authors, publishers, and even developers.

Generation Is Transformative

The judge in this case ruled that whatever Claude AI spits out is a “transformative work.” I highly disagree with this position, given that genAI is effectually a predictive text generator at scale. These models work, largely, by predicting what the next word is most likely to be.

In that sense, the models are not “intelligent.” They do not know things, even if they seem to “give” the right answer some of the time. (And whether they get the answer correct is another story. AI hallucinations are a problem.)

Given the lack of intelligence, an AI model is not purposefully remixing or reinventing anything. It is simply looking at everything that it has been fed and saying based on pattern analysis, this is most likely to be the next word.

The speed and scale of this prediction is relatively impressive. This is also why AI-generated texts tend to veer off in strange directions, read in stilted ways, and so on. The AI is not “thinking” about narrative the way we do (if we can say it’s thinking at all, to be honest), so it can’t construct a narrative.

Ergo: an AI-generated text is not transformative.

Nonetheless, the judge ruled that AI word salad is transformative. He did suggest that any individual output could be challenged in court. That puts burden on individuals to sue over individual generations. It could create a legal burden for companies like Anthropic, especially if publishers, authors, or other creators decided to launch multiple suits. Many of them might be dismissed as frivolous or without merit. Yet that situation could create a legal mire for AI companies, which could further undermine their questionable profitability.

Storing Works Is Copyright Infringement

The judge’s ruling wasn’t firmly on the side of Anthropic, however. While the mixed ruling suggested that whatever the AI generates is “transformative” enough not to infringe copyright, as a general rule, the storage of copyrighted works was infringement. In this, the judge found that Anthropic had violated copyright for nearly seven million works, which it had scraped from known pirate sites.

The judge ordered that the copies of these works housed in Anthropic’s training databases should be destroyed. This makes the legal framework a touch more tricky for companies to navigate.

At first blush, this doesn’t look like much of an issue. Claude AI is already trained, so it doesn’t need further access to these works. Deleting them from the database should not impact the function of the current Claude AI model.

Deleting Is Poor Practice for AI Training

The rule of thumb for AI training databases, however, is that you should not delete anything from the training set. There are a few reasons.

The first is that you usually want to have access to the original training database. The deletion of up to seven million works would have a fairly profound effect on the integrity of that database. That might cause issues when you want to refine, update, or create a new version of the current model.

The other issue is that developers typically retain the training database to use with future models. Creating and maintaining a database takes resources, like time and money, even when you’re pirating most of what you’re storing in the database.

Deleting millions of works threatens the integrity of the database, which may mean it isn’t as useful in training future models. Worst case scenario: the database isn’t useful, and the developers need to basically create a new database.

In a best case scenario, this makes stealing works to put in your database unappealing. Even if you train a successful model with pirated works, you may be sued and forced to delete part of that database—which, in turn, craters the return on investment you’re getting out of putting the database together.

It will be up to companies like Anthropic to weigh their options and come to the conclusion that paying to license works for storage in their training bases is actually the more economic way to go. The question is whether they do come to that conclusion.

Who Is Getting a Payout from the Anthropic Settlement?

The biggest reveal around the Anthropic settlement was that a number of large publishers had been negligent in registering works with the US Library of Congress.

The LOC offers copyright registration for a fee. What’s important to understand here is that registration is completely optional. Under the current copyright law, copyright exists intrinsically. That means the second you create an original work, you own the copyright on it. You do not need to do anything.

A small black gavel with a gold band sits on a matching rest near a pair of black binders, on top of a black tabletop surface. — (Sora Shimazaki / Pexels.com)

LOC registration exists to make your copyright easier to defend in court. While your copyright is intrinsic, registering your work with LOC creates a legitimate record of when and where the work was created, what the work was, and who owns the copyright. It’s then very easy to look this information up.

This is supremely handy if you happen to get sued for copyright infringement. It’s also useful if you want to sue someone else for infringing your copyright. Most big publishers know this. Many of them even have in their contracts that they will register the work with LOC, even though it’s not strictly necessary.

The Anthropic Settlement Is a Missed Opportunity Due to Publisher Laziness

What came to light in this case was that many publishers were neglecting their duty, including heavy-hitters like PRH. In this case, that negligence bit them in the ass. Of the seven million pirated works, only about 500,000 were eligible to receive compensation under the Anthropic settlement.

Why was that? The judge ordered that only works registered with the LOC would be considered for compensation. Suddenly, it was a big problem if you hadn’t registered the works with LOC. It was an even bigger problem if you’d told the author(s) that you would.

So, we can see how the publishers got some bad press here. Anthropic was ordered to pay up to $150,000 per work—but six and a half million works were ineligible.

Copyright Was Still Infringed for All Seven Million Stolen Works

Now, that’s not to say the remaining works were fair game. After all, the judge ordered the deletion of all seven million works. Rather, the decision to use LOC to guide who was eligible for compensation was likely based on expediency. There were seven million works to consider. As I noted, copyright is pretty easy to prove if it’s registered with LOC. The judge likely looked at the sheer number of works and realized they could be there forever, trying to hash out ownership of copyrights. Using LOC registration records expedited that process.

That doesn’t mean the copyright owners of the remaining six and a half million works did not have their copyright infringed. It doesn’t mean copyright owners can’t still sue Anthropic. What it means is that they need to start a separate suit in which they prove their ownership of the copyright over these works.

That is another legal nightmare, for both sides. And since it requires publishers to hash out their ownership, it could be a lengthy process, which they’re unlikely to pursue. Individual authors are also unlikely to pursue this avenue. It would take time and resources they probably don’t have, in addition to pitting them against corporate lawyers from a large multibillion-dollar company.
So: it’s unlikely we’ll see a spate of additional suits from either publishers or authors as they attempt to pursue compensation for works not included in the settlement of this case. That is an option that remains open.

A Step Toward a Pay to Play Model

What’s more likely to happen here is that publishers are going to approach companies like Anthropic with a payment scheme that allows storage of copyrighted works in a database. In all likelihood, this is what the publishers were after in the first place. The settlement and the order to delete the pirated works is nice. There’s a good chance Anthropic will attempt to save their database or build a new one by asking for permission this time.

Something similar happened when publishers took Google to court over Google Books, all the way back in 2008. Google had promised to only add public domain works to its repository. The company worked with university libraries to scan in their collections. Their argument was that they were helping to preserve old and rare books.

Somewhere in that process, they decided it was easier to ask for forgiveness than to ask for permission, so they started wholesale scanning copyrighted works into Google Books. Unsurprisingly, publishers sued. In the settlement, Google agreed to a licensing scheme, which publishers could opt into.

I’m envisioning we’ll see something similar as the ultimate outcome of the Anthropic settlement and other similar cases. Under such a scheme, publishers and authors could “opt” to provide permission to having their copyrighted works used for AI training purposes, stored in a database. This would arguably provide some compensation to the rights holders. I’ve no doubt that publishers are looking at it as a potentially lucrative income stream.

Compensation Is Likely to Be Laughable

The downside is that the compensation paid is unlikely to be even a fraction of what a work is truly worth. If it approached any level of fairness, it’s likely that AI companies would simply opt to keep stealing instead. Running the risk of getting sued and needing to delete a good chunk of a database would be cheaper. So any agreement worked out to allow the legal storage of copyrighted works will necessarily be unfair to creators.

An overhead shot of several coins, mostly American pennies, in a glass jar, sitting on top of a white wooden surface. Any payouts authors gets from AI training inclusion schemes will likely be chump change. — Photo by Miguel Á. Padriñán on Pexels.com

Some people might argue that hey, at least it’s something, which is true to some degree.

The only upshot to this would be if authors could truly opt out. That would mean if they did not consent to their work being used, it would not be included.

That, however, requires us to put a lot of faith in the idea that AI companies are going to do the right thing. And they’ve already shown they’re not exactly operating on an ethical basis …

How AI Companies Will Get Around This

I am not going to be surprised if Anthropic keeps “whoops,” finding copies of those seven million works in their databases. The settlement requires them to delete them, but the question is, will they? And beyond that: will we even know if they do? Who is checking on this? Who is going to monitor them? Do they have multiple copies stored in different places? If they delete them from one, do they delete them from all?

More likely, there are copies stored in multiple places. Anthropic, whether maliciously or not, is unlikely to delete them all. That means, even if people are monitoring, it’s likely that the company will get “caught” with copies. They’ll simply say, “Whoops, we missed that copy, deleting now!”

The same is true with regard to other aspects of the Anthropic settlement. Will AI companies agree to a pricing scheme that will grant them ongoing permission to store copyrighted works? Or will they decide that it’s easier and cheaper to keep stealing? What about respecting the decisions of authors or publishers who want to opt-out? Will they make sure they don’t include those works, or will this be another “whoops, our bad!” moment?

To date, AI companies have given us no reason to trust them. Even with the Anthropic settlement, there are all kinds of holes in the evolving legal framework.

With that said, there are also holes that allow creatives to continue to go after AI companies for unauthorized use of their work, which means we can continue to be a thorn in their sides.

One thing is for sure: the legal landscape will keep evolving. Creatives may not have been offered fair compensation for their works in this round, but the fight’s not over yet.

AI, Anthropic, and What the Settlement Means for Authors

Generation Is Transformative

Storing Works Is Copyright Infringement

Deleting Is Poor Practice for AI Training

Who Is Getting a Payout from the Anthropic Settlement?

The Anthropic Settlement Is a Missed Opportunity Due to Publisher Laziness

Copyright Was Still Infringed for All Seven Million Stolen Works

A Step Toward a Pay to Play Model

Compensation Is Likely to Be Laughable

How AI Companies Will Get Around This

Related

About the author

Cherry

Recent Posts

Archives

Generation Is Transformative

Storing Works Is Copyright Infringement

Deleting Is Poor Practice for AI Training

Who Is Getting a Payout from the Anthropic Settlement?

The Anthropic Settlement Is a Missed Opportunity Due to Publisher Laziness

Copyright Was Still Infringed for All Seven Million Stolen Works

A Step Toward a Pay to Play Model

Compensation Is Likely to Be Laughable

How AI Companies Will Get Around This

Related

About the author

Cherry

Read more

Recent Posts

Archives