
United States: AI can train with books, even copyrighted ones!

A federal court in San Francisco has issued a landmark decision regarding the use of copyrighted books to train artificial intelligence models. This case, involving the company Anthropic and its AI model Claude, could well reshape the landscape of technological development and intellectual property in the United States.

But what does this decision mean for the future of AI and content creators?

A victory for AI, under the umbrella of “fair use”

The core of this decision rests on the interpretation of the “fair use” doctrine, a key principle of American copyright law that allows limited use of protected content without the authorization of rights holders, under certain conditions.

According to the judge, using books to train machine learning models such as Claude falls under fair use, whether or not the copies were legally acquired. In his view, this practice promotes technological innovation and major advances in artificial intelligence without causing unreasonable harm to authors.

This position marks a significant step for technology companies. Indeed, training AI models requires massive amounts of textual data, often drawn from books, articles, or other protected works. This data is crucial for the process of supervised and unsupervised learning, allowing models to generalize and improve their performance on various tasks.

Until now, the use of such content raised complex ethical and legal questions, with authors and publishers denouncing unauthorized exploitation of their works.

With this decision, the court appears to open a legal path for AI companies. It recognizes that training a model transforms the original works rather than directly reproducing them, and it treats that transformation as part of how deep learning optimizes a neural network.

A clear limit: mass piracy is not tolerated

However, the judge did not give Anthropic a blank check. While he validated the use of books for model training, he firmly condemned the practice of downloading millions of pirated books to build a permanent digital library.

Such an approach, according to the court, oversteps the bounds of fair use and violates authors’ rights. This distinction is crucial: it establishes a boundary between the temporary use of data to develop algorithms and the creation of illegal archives of protected content.

This nuance could have significant repercussions for technology companies.

While AI training is now better protected under fair use, questionable practices, such as the systematic use of illegal sources, remain under judicial scrutiny. Companies will therefore need to be extra vigilant to ensure their data collection methods comply with legal frameworks.

Anthropic, as well as companies such as OpenAI and xAI, could benefit from this legal clarification to accelerate their research and innovation. It could also stimulate competition in the sector by allowing more players to access training data without fear of immediate legal action.

AI: A threat to publishers?

On the other hand, authors and publishers might see this as a threat.

If fair use authorizes the use of their works without direct compensation, this could reduce their potential income, especially in a context where AI is already generating competing content. Some fear that this decision could set a precedent favoring tech giants at the expense of individual creators.

Voices are already rising to demand copyright reform, to better protect works in the age of AI.

The tension between technological innovation in AI and the protection of intellectual property is far from resolved.

In the meantime, tech companies will have to navigate cautiously, ensuring their practices respect the limits set by the courts. For authors, this decision could be an invitation to rethink their economic models.

Some might consider partnerships with AI companies, making their works available for compensation, while others might intensify their efforts to protect their rights. In any case, this ruling marks a turning point in how AI interacts with the creative world.

Source: LeFigaro

Glen
