Aug. 22, 2023, 5:30 p.m.

Brief news summary


According to an article in The Atlantic, works by many authors, including Margaret Atwood, Haruki Murakami, and Jonathan Franzen, have been used to train artificial intelligence tools. Companies such as Meta and Bloomberg have fed more than 170,000 titles into their models, including Meta's LLaMA and Bloomberg's BloombergGPT. These models, along with others like OpenAI's ChatGPT and EleutherAI's GPT-J, generate content based on patterns found in sample texts. The dataset, called Books3, is approximately one-third fiction and two-thirds nonfiction, and most of its titles were published in the last two decades. Notable authors in the dataset include Zadie Smith, Stephen King, Rachel Cusk, Elena Ferrante, Margaret Atwood, Haruki Murakami, bell hooks, Jonathan Franzen, Jennifer Egan, David Grann, George Saunders, Junot Díaz, Michael Pollan, Rebecca Solnit, Jon Krakauer, L. Ron Hubbard, and John MacArthur.

Three writers, Sarah Silverman, Richard Kadrey, and Christopher Golden, recently filed a lawsuit claiming that their copyrighted works were used to train Meta's LLaMA. OpenAI has also been accused of training its models on copyrighted works; a 2020 document from the company mentions internet-based book corpora, specifically one called Books2. The data appears to originate from shadow libraries such as Library Genesis (LibGen) and Z-Library. Shawn Presser, the independent AI developer who created Books3, said he understands authors' concerns but worries about large companies controlling this technology. Meta declined to comment on its use of Books3. A Bloomberg spokesperson confirmed that the company used the dataset but said that future versions of BloombergGPT would not include it.



