Tech Companies Using 183,000 Books To Train AI. The Problem is...
NDTV
Many authors took to social media to express their outrage and shared screenshots which showed that their copyrighted novels were part of the list.
Nearly 200,000 books are being used by some of the biggest companies in technology to train their generative AI models, according to a report by The Atlantic. Books by famous authors including J.K. Rowling, Amitav Ghosh, Rupi Kaur, and Neil Gaiman are part of a dataset of pirated books known as Books3. However, no one has told the authors.
One author wrote on social media: "Here to report a theft. I spent three decades of my life to write my books. The AI large language models did not 'ingest' or 'scrape' 'data.' AI companies stole my work, time, and creativity. They stole my stories. They stole a part of me."
The collection spans genres ranging from erotic fiction to prose poetry. The report says that these books help generative AI systems learn how to communicate information.
A CNN report said that some AI training text is pulled from articles posted on the internet. Books3 is already the subject of multiple lawsuits against Meta and other companies that used the dataset to train AI.