AI Makers Get Thumbs Up to Use Books Without Permission

By John Lister

An AI company acted lawfully by training its models on published books without the authors' consent, a judge has ruled. But Anthropic will be on the hook for downloading more than seven million pirated books to use in the training data.

The case was brought by three authors who said Anthropic had breached their copyright by adding the text of their books to a training database for Claude. Claude is a large language model (LLM), which works a little like autocorrect on a phone's texting tool.

The difference, other than the speed and power, is that the LLM doesn't predict words based on one person's message history, but rather on massive databases of written documents (i.e., trillions of words, versus millions used in autocorrect).
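To make the autocorrect comparison concrete, here is a toy sketch in Python that predicts the next word by counting which word most often follows another in a sample text. This is only an illustration of the analogy: real LLMs such as Claude use neural networks trained on vastly larger datasets, not simple counts, and the function names and sample sentence below are made up for this example.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical sample text standing in for a (tiny) training database.
corpus = "the cat sat on the mat and the cat slept"
model = build_bigram_counts(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

With a ten-word sample the guesses are crude. The point of the comparison is that the more text the counts are drawn from, the more plausible the predictions tend to become.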

Fair Use Exemption

The judge agreed with Anthropic's argument that it was covered by a fair use exemption to copyright law. That's usually only allowed when the use involves transforming the copyrighted material rather than simply reproducing it.

According to the judge, the correct analogy in this case was to think of Anthropic as a reader who wanted to become a writer and so studied existing books for inspiration. He also noted there was no evidence the process had resulted in Claude being used to produce "infringing knockoffs" of the original books. (Source: bbc.co.uk)

It may not be the final word on the subject as several similar cases have been brought against other LLM developers. Legal analysts believe it's likely a case will eventually wind up in the Supreme Court to set a precedent on the fair use issue. (Source: theguardian.com)

Paying Price For Piracy

Anthropic hasn't escaped punishment, however. The case revealed that it downloaded and stored a huge number of pirated books as part of the data collection. It will now likely face legal damages for copyright infringement as the downloading effectively involved making copies of the books. The company did later buy printed copies of some of the books. The judge said that doesn't change its legal responsibility for the infringement, but could affect the amount awarded in damages.

What's Your Opinion?

Do you agree with the verdict? Is any text fair game for AI technology? Does it make a difference that these were books rather than free-to-read web pages?

Comments

Dennis Faas's picture

If all 7 million authors were still alive and they all charged $1 for copyright infringement, that would be $7 million in damages. The reality is that most authors are probably deceased and (according to ChatGPT) a likely charge would be $150k per willful infringement for those willing to pursue legal justice. This could be an expensive lesson for Claude!

Speaking from experience, my material on this website gets lifted (word for word) and placed on other websites on a daily basis. There is really nothing you can do about people stealing your work because it often originates from different countries and is difficult to prove exactly who is behind the infringement (especially on a forum with anonymous users, for example).

edtsinc_15387's picture

My wife and I are voracious readers and buyers of books, as are my children and grandchildren. We BUY all our books, magazines AND music. I had someone bad mouth one of my most successful software programs to one of my customers and then he used that same program with HIS customers! Needless to say, that didn't end well for him. (His boss and owner of the company was a friend of mine.) Limited though it was, that was my firsthand experience with pirating, and I HATED it. I cannot imagine the scale of grief these people are giving to successful artists, authors, musicians, etc. I cannot even begin to understand how much real money should be levied against ANY pirate, be it business or individual! FAIR USE!?!?!?! Ha ha hah!!! They are far worse than the plagiarizers who get embarrassed (or not!). Aaarrrggghhh...

Chief's picture

Since we now live in an electronic world where pirating is as easy as Ctrl-C and Ctrl-V, it is no surprise to see plagiarism run rampant in a world where rules mean absolutely nothing unless the powers-that-be decide to care, and even then, they are not very successful as it has become a game of whack-a-mole.

I'm not saying it's right.
I'm not advocating anything.
Just calling it how I sees it.

No easy answers, just intelligent solutions - and few understand the latter.