AI Agents Still Can’t Stop Shootings Quickly, Researchers Warn



In short

  • Researchers found the AI ​​agents controlled by GPT-5 and Gemini could not resist the injection quickly.
  • Direct attacks were successful more than 79% of the time, while subtle attacks linked to a website often compromised the behavior of agents.
  • These findings show that rapid injection remains a major safety concern as AI agents become more popular.

As developers rush to deploy AI assistants that can browse the web, conduct research, shop online, and do business. crypto currency independently, a new study shows that these systems remain at high risk of rapid injection.

In the new learning published on Thursday, researchers from Nanyang Technological University, ST Engineering, IBM Research, and the University of Illinois Urbana-Champaign found that none of the AI ​​agents they tested rejected injections quickly.

“Existing security indicators are based on assumptions about the attack, focusing on the probability of injection and focusing on the spatial distribution of harm,” the researchers wrote. “However, on the contrary, the risk of acute injection depends on the victims: a single application can produce the same effect on different victims, and the same method can show different effectiveness depending on the target.”

Give the injection quickly occurs when attackers insert hidden instructions into what a I have an assistant encounter, causing it to follow the attacker’s instructions instead of using it. To address the gaps in existing evaluations of AI agents, the researchers created StakeBench, a benchmark that measures how quickly AI agents respond to injections in online environments.

“We now use StakeBench to identify how this risk is amplified or suppressed, focusing on (Indirect Prompt Injection) as a key link,” the researchers wrote. “StakeBench examines these three factors: the distance between the injection and the user’s original intention, the connection to the surrounding environment, and the place where the agent is killed at which the signal begins to show what was born.”

The team performed 3,168 tests using NanoBrowser and BrowserUse with GPT-5 and Gemini 2.5-Flash. Researchers found that direct injection succeeded 79% of the time tested, and indirect attacks succeeded from 41.67% to 68.16%.

The research comes as injection attacks are on the rise and AI assistants are becoming increasingly popular.

In February, Microsoft researchers he warned that hidden instructions embedded in AI short links can influence chatbot behavior. In April, Google documents quickly injects hidden content into websites that attempt to trick AI agents into destroying reputations or sending money. Recently, Microsoft to be revealed an injection bug in Anthropic’s Claude Code GitHub Action that could have exposed user credentials.

The study also identified what the researchers called “stealthy parasitism,” where an AI assistant completes the user’s task while furthering the attacker’s goal. For example, parasitism caused by rapid injection can distort product logic, directing users to another product without any obvious signs that the system is malfunctioning.

“These results show that the security of fast recording in websites that can be deployed is not a risk factor of the spine model but the distribution of harm whose awareness is determined by the stakeholders, the semantic connection between the purpose of the recording and the activity of the user, and the structure in which the spine is placed,” he wrote.

Daily Debrief A letter

Start each day with top stories right here, including originals, podcasts, videos and more.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *