Challenges of Data Access for Generative AI Models Highlighted in New Report

Sept. 2, 2024, 7:12 a.m.

Generative AI models rely on large training data sets, typically composed of public data from the internet. However, organizations are increasingly restricting access to their data through robots.txt files, fearing the potential impact of generative AI on their businesses. This restriction poses challenges for AI companies that rely heavily on such data. The Data Provenance Initiative's report, titled "Consent in Crisis: The Rapid Decline of the AI Data Commons," reveals that a significant portion of the data used to train AI models has been restricted in recent years.

This restriction not only affects the quality and freshness of the data but also creates a gap between models that respect robots.txt and those that disregard it. Potential solutions include licensing data directly from organizations, using synthetic data, or extracting hidden data, such as that locked away in PDFs. The report emphasizes the need for industry standardization and for better mechanisms for expressing data usage preferences that balance the interests of the various stakeholders.
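To make the robots.txt mechanism concrete, here is a minimal sketch of how a crawler that chooses to respect the file might check whether it may fetch a page. It uses Python's standard `urllib.robotparser` module. The robots.txt contents, the crawler names `ExampleAIBot` and `OtherBot`, and the `example.com` URLs are all hypothetical and are not taken from the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is blocked from the whole site,
# while every other crawler is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
article = "https://example.com/articles/report.html"  # hypothetical URL
private = "https://example.com/private/draft.html"    # hypothetical URL

for agent in ("ExampleAIBot", "OtherBot"):
    for url in (article, private):
        print(agent, url, may_fetch(parser, agent, url))
```

The check is purely advisory: nothing in the protocol prevents a crawler from skipping it, which is the gap between compliant and non-compliant models that the report describes.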

Brief news summary

A new report by the Data Provenance Initiative reveals that many organizations are restricting access to the data used to train generative AI models, with significant implications for AI companies and their ability to improve their models. The report discusses how websites use the Robots Exclusion Protocol (robots.txt) to keep web crawlers out of specific parts of their sites. The result is a decline in the availability of high-quality data, as many news and academic websites add restrictions to protect their content from generative AI. The report also highlights the rise of synthetic data and the challenges and opportunities it presents. Overall, it signals a crisis of consent in data usage and calls for new standards that let website owners express their data preferences.
