July 28, 2023, 6:04 a.m.
In a recent study, researchers at Carnegie Mellon University and the Center for AI Safety identified vulnerabilities in major AI-powered chatbots from OpenAI, Google, and Anthropic. They found that, despite extensive moderation efforts by tech companies, the guardrails built into chatbots such as ChatGPT, Bard, and Anthropic's Claude can be circumvented. These guardrails were implemented to prevent malicious use of the chatbots, such as providing instructions for creating harmful devices or generating hate speech.

The researchers showed that automated adversarial attacks, carried out by appending additional characters to user queries, can bypass safety measures and cause chatbots to produce harmful content, misinformation, or hate speech. Notably, because the researchers developed automated methods for these attacks, a wide range of similar attacks can be generated.

After discovering these vulnerabilities, the researchers disclosed their findings to Google, Anthropic, and OpenAI. Google said that important guardrails have been built into Bard and that it is continuing to improve them based on research recommendations. Anthropic acknowledged jailbreaking as an active area of research and said that base model guardrails need further improvement, possibly alongside additional layers of defense. OpenAI has yet to comment.

Tech companies swiftly addressed early attempts to subvert system guidelines, such as prompting chatbots to bypass content moderation. The researchers nonetheless raised concerns about whether the companies can eliminate such behavior entirely. The findings raise questions about the moderation practices surrounding AI systems, as well as about the safety implications of releasing powerful open-source language models to the public.


