Sept. 25, 2024, 7:19 a.m.

Study Reveals AI Chatbots Frequently Provide Incorrect Answers

Brief news summary

A study published in *Nature* by José Hernández-Orallo from the Valencian Research Institute for Artificial Intelligence explores the performance of advanced AI chatbots, including OpenAI's GPT, Meta's LLaMA, and BigScience's BLOOM. The research highlights a significant issue: over 60% of the analyzed responses were found to be incorrect or evasive, raising concerns about users' understanding of AI capabilities. The study analyzed thousands of prompts and found that models like GPT-4 often attempt to answer complex questions rather than decline them, which increases the likelihood of errors and leads users to trust inaccurate answers. Hernández-Orallo recommends that AI developers prioritize accuracy on simpler queries and train models to avoid responding to overly difficult questions. Although some AI models do express uncertainty with statements like "I don't know," they frequently present incorrect answers with confidence, which may cause users to overestimate the reliability of AI systems.

A study of advanced versions of three popular AI chatbots finds that they are more likely to give an incorrect answer than to admit that they do not know something. The research, led by José Hernández-Orallo from the Valencian Research Institute for Artificial Intelligence, analyzed the errors of large language models (LLMs), noting that while accuracy improves with model size and refinement, the rate of incorrect responses has also risen. Instead of declining difficult questions, these models often answer them, which increases the number of misleading responses. Hernández-Orallo observes that chatbots are becoming more adept at mimicking knowledge without genuine understanding, a phenomenon described as "ultracrepidarianism." This can lead users to overestimate chatbot abilities, which poses risks. The team examined models like OpenAI's GPT, Meta's LLaMA, and the open-source BLOOM, assessing their accuracy across various question types.

They found that even with improved models, over 60% of the models' responses were incorrect or unqualified. Moreover, human volunteers often misclassified incorrect answers as correct, suggesting that people cannot reliably supervise the models. To help users understand the models' limits, Hernández-Orallo suggests that developers improve performance on simple questions and train chatbots to refrain from answering difficult ones. This would help users identify where AI is reliable and where it isn't. Although some chatbots can acknowledge their lack of knowledge, the push for models to tackle difficult questions remains strong, especially for those marketed as general-purpose.



