Limitations of Large Language Models: A Study on AI's Incomplete World Understanding

Nov. 4, 2024, 8 p.m.

Large language models (LLMs) can do impressive things, such as writing poetry and generating working computer programs, even though they are trained mainly to predict the next word in a piece of text. These capabilities suggest that LLMs might be learning general truths about the world, but a new study challenges that assumption.

The researchers found that a popular type of generative AI model could provide highly accurate driving directions in New York City without having formed an accurate internal map of the city. When they altered the environment by closing streets and adding detours, the model's performance declined significantly. The study indicated that the model's internal representation of New York City was flawed, with many nonexistent streets and illogical connections.

This raises the concern that AI models can appear to perform well in one context yet fail when conditions change only slightly. Senior author Ashesh Rambachan emphasized that if researchers want to apply these tools in scientific fields, it is important to know whether LLMs are learning coherent models of the world. Rambachan and several collaborators will present their findings at the Conference on Neural Information Processing Systems. The researchers focused on transformers, the type of model that forms the backbone of LLMs such as GPT-4 and is trained on large amounts of language data to predict the next token.
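The gap between giving good directions and holding a correct map can be illustrated with a toy example. The sketch below is not the study's setup: the four-intersection street graph, the function names, and the idea of a "memorized" route are all assumptions made for illustration. It shows how a route that was valid on the original map can break after a single street closure, while a planner that consults the actual map finds the detour.

```python
from collections import deque

# Hypothetical toy map: intersections are nodes, one-way streets are edges.
STREETS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def shortest_route(streets, start, goal):
    """Breadth-first search over the street graph; returns a route or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in streets.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def route_is_valid(streets, route):
    """A route is valid if every consecutive pair is joined by an open street."""
    return all(b in streets.get(a, []) for a, b in zip(route, route[1:]))

def close_street(streets, a, b):
    """Return a copy of the map with the street from a to b removed."""
    return {
        node: [n for n in nbrs if not (node == a and n == b)]
        for node, nbrs in streets.items()
    }

# A route learned on the original map (stands in for memorized directions).
memorized = shortest_route(STREETS, "A", "D")

# Close one street that the memorized route depends on.
detour_map = close_street(STREETS, "B", "D")

print("memorized route:", memorized)
print("still valid after closure:", route_is_valid(detour_map, memorized))
print("route using the real map:", shortest_route(detour_map, "A", "D"))
```

A system that only reproduces routes it has seen behaves like `memorized` here, whereas one with an accurate map can replan, which is the distinction the detour experiment probes.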

The team developed two new metrics, sequence distinction and sequence compression, to evaluate whether a transformer has formed a sound world model. They tested these metrics on two deterministic problems: navigating New York City streets and playing Othello. Contrary to expectations, transformers trained on randomly generated sequences formed more coherent world models than those trained on sequences that followed a strategy, likely because they were exposed to a broader range of possible moves during training.

Although the models generated accurate outputs, only one formed a coherent world model for Othello, and none formed a coherent world model in the New York navigation task. When the researchers simulated detours, the model's accuracy dropped from nearly 100 percent to 67 percent after only a minor change to the street map. The maps recovered from the models depicted an unrealistic and overly complex version of New York City, underscoring that transformers can excel at a task without understanding its underlying rules.

Rambachan cautions against assuming that these models understand the world simply because they achieve impressive results. In future work, the researchers plan to study a more diverse set of problems, including ones where the rules are only partially known, and to apply the new evaluation metrics to real-world scientific problems. The study is supported by various grants, including funding from Harvard and the National Science Foundation.
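One way to read the two metrics is as pairwise checks on a deterministic system: sequences that lead to the same state should be treated alike (compression), and sequences that lead to different states should be told apart (distinction). The sketch below is a simplified, single-step interpretation under that assumption, not the study's definitions or code. The bounded-counter "world", the `flawed_model_next` stand-in for a trained model, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical deterministic world: a counter that must stay in 0..MAX_COUNT.
# Tokens are "+" (increment) and "-" (decrement).
MAX_COUNT = 2

def true_state(prefix):
    """True world state reached after a valid token sequence."""
    count = 0
    for tok in prefix:
        count += 1 if tok == "+" else -1
    return count

def true_next(prefix):
    """Tokens that are actually legal after this prefix."""
    count = true_state(prefix)
    legal = (("+", count < MAX_COUNT), ("-", count > 0))
    return frozenset(tok for tok, ok in legal if ok)

def flawed_model_next(prefix):
    """Stand-in for a trained model: it looks only at the last token."""
    if prefix and prefix[-1] == "-":
        return frozenset({"+"})
    return frozenset({"+", "-"})

def compression_score(model_next, prefixes):
    """Share of same-state prefix pairs that the model treats identically."""
    pairs = [(p, q) for p, q in combinations(prefixes, 2)
             if true_state(p) == true_state(q)]
    return sum(model_next(p) == model_next(q) for p, q in pairs) / len(pairs)

def distinction_score(model_next, prefixes):
    """Share of truly distinguishable prefix pairs that the model tells apart."""
    pairs = [(p, q) for p, q in combinations(prefixes, 2)
             if true_next(p) != true_next(q)]
    return sum(model_next(p) != model_next(q) for p, q in pairs) / len(pairs)

# A handful of valid prefixes, chosen by hand for illustration.
PREFIXES = ["", "+", "++", "+-", "++-", "+-+"]

for name, model in (("true rules", true_next), ("flawed model", flawed_model_next)):
    print(name,
          "compression:", compression_score(model, PREFIXES),
          "distinction:", distinction_score(model, PREFIXES))
```

The true rules score 1.0 on both checks by construction, while the last-token model scores lower on both, because it splits prefixes that reach the same state and merges prefixes that reach different ones.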

Brief news summary

A recent study highlights the limitations of transformer-based large language models (LLMs), despite their success at tasks such as navigation. For instance, a model can give accurate directions through a complex urban environment like New York City yet struggle with unexpected changes such as road closures, revealing flaws in its internal map. This raises significant concerns about the reliability of generative AI in dynamic situations. The researchers introduced two metrics, "sequence distinction" and "sequence compression," to assess whether a model has formed a coherent world model, and applied them to New York City navigation and the game Othello. Intriguingly, models trained on randomly generated sequences formed more coherent world models than those trained on sequences that followed a strategy. This suggests that transformers can excel at specific tasks without understanding the rules behind them. The study calls for a reassessment of expectations regarding LLM capabilities and underscores the need for further research on how coherently these models represent the world, particularly before they are applied in scientific contexts.
