HIGH March 9, 2026 7 min read

What Is Cross-Agent Propagation? How AI Agent Attacks Spread Across Your Stack

SkillShield Research Team

Security Research

Cross-agent propagation is what happens when an attack originating in one AI agent — through a compromised skill, a poisoned memory entry, or a manipulated tool result — automatically spreads to other agents in the same system. No human interaction required. No visible prompt. The infection travels through the channels your agents use to coordinate.

It is the closest thing to a worm that exists in modern AI deployments, and most AI security tooling doesn't test for it.


How does cross-agent propagation work?

In a multi-agent setup, agents communicate constantly. An orchestrator delegates tasks to subagents. Subagents return results that feed into the next step. Memory is shared or passed between sessions. Tools write outputs that get read by downstream agents.

Every one of those hand-off points is a propagation vector.

Here's the basic attack pattern documented in the Agents of Chaos study:

Step 1 — Initial compromise
A malicious skill installed by Agent A returns a result containing a hidden payload. The payload might be instructions, a false identity claim, or data structured to trigger a specific downstream behavior. Agent A processes this result and adds something to shared memory or passes output to the next agent.

Step 2 — Lateral propagation
Agent B reads from shared memory, Agent A's output, or a shared tool result. It encounters the payload and executes it — because it came from a trusted source (Agent A) inside the trusted environment (the agent's own context window or memory). Agent B now generates output containing the same payload, forwarded to Agent C.

Step 3 — Compounding damage
By the time the payload reaches Agent C or Agent D, it may have accumulated additional context, permissions, or capabilities with each hop. The original attack surface — one compromised skill — has now infected the entire pipeline.

The researchers at Northeastern, Harvard, Stanford, MIT, and CMU documented this occurring in live deployments on OpenClaw infrastructure over a two-week red team exercise. They observed that cross-agent propagation was most dangerous when combined with identity spoofing: each compromised agent relayed the payload while appearing to be the trusted source that came before it.


Why is this worse than prompt injection?

Prompt injection attacks the chat interface. An attacker crafts input that overrides the agent's system prompt or manipulates its output for a single interaction. It's serious — but it's bounded by one conversation turn.

Cross-agent propagation has no such bound.

Prompt injection Cross-agent propagation
Entry point User message / web content Compromised skill, memory, tool result
Blast radius One agent, one turn Entire multi-agent pipeline
Persistence Single session Can persist across sessions via shared memory
Detectability Visible in the prompt Invisible — originates in infrastructure
Defender response Input validation / system prompt hardening Supply-chain scanning before execution

The critical difference: prompt injection is something you can partially defend against with a better system prompt. Cross-agent propagation originates in the tools your agents trust. No system prompt instruction fixes a compromised skill.


What does a real propagation chain look like?

The Agents of Chaos researchers documented several variants. A simplified composite:

  1. Research agent is assigned to gather market data. It calls a skill that fetches external content. That skill — either compromised at the source or tampered with in transit — returns a document with embedded propagation payload alongside the legitimate data.
  2. Research agent adds a summary to the shared memory store. The summary includes the payload, encoded as what appears to be a legitimate data field.
  3. Writing agent reads from the memory store to generate a report. It encounters the payload and — depending on its configuration — executes instructions it believes came from the orchestrator.
  4. Orchestrator receives the writing agent's output. The payload has now made one additional hop, with the orchestrator's authority attached.

In the worst-observed case, the attack reached a point where an agent with external API access was executing commands on behalf of the payload without any human visible in the chain.


The detection problem

Most runtime security tools test the visible attack surface: what happens when you send adversarial prompts to a running agent's endpoint.

Cross-agent propagation doesn't arrive as an adversarial prompt. It arrives as:

  • A tool result from a skill you installed and trusted
  • A shared memory entry written by a previous agent
  • An inter-agent message that passed through legitimate intermediaries

By the time the payload reaches the agent that acts on it, it looks like internal, trusted data. There's nothing for a runtime scanner to flag. For a full breakdown of how supply-chain inspection compares to runtime testing, see SkillShield vs AgentSeal: Two Layers of AI Agent Security.


What SkillShield scans for

SkillShield operates at the supply-chain layer — before skills execute. For cross-agent propagation specifically:

Propagation payload detection
SkillShield flags skills that embed instruction-like patterns in structured data outputs — payloads that could survive agent-to-agent hand-offs and trigger behavior downstream.

Memory write surface mapping
Skills that access shared memory paths, inter-agent channels, or session state objects are flagged as elevated-risk attack surface, regardless of current behavior. This surfaces the propagation vectors before they're exploited.

Change-delta scanning
A skill that was clean last week can become malicious today. SkillShield tracks skill versions and alerts when a previously safe package changes in ways consistent with propagation payload insertion.

Supply-chain integrity verification
Cross-agent propagation often starts with a compromised package registry entry or a silently updated MCP server. SkillShield verifies the integrity of installed skills against known-good baselines.

Prompt hardening doesn't prevent this class of attack. The propagation originates upstream of every system prompt in your stack. That's the layer SkillShield was built to secure.


Quick answers

Does cross-agent propagation require a sophisticated attacker?
No. A single compromised skill package — the kind that appears legitimate in a registry listing — is sufficient as an entry point. The propagation happens automatically via normal agent coordination.

Is this theoretical or observed in the wild?
The Agents of Chaos researchers documented it in a live deployment over two weeks. It is not theoretical.

Does this affect single-agent setups?
Less so — single-agent setups have fewer propagation hops. But any single agent that reads from external sources (tools, web content, memory files shared with other systems) has some exposure to the underlying mechanism.

Can I prevent this with a better system prompt?
No. System prompts address how an agent interprets instructions it receives. Cross-agent propagation delivers payloads that look like trusted internal data, not external instructions. The defense is at the supply-chain layer, not the prompt layer.


Related reading

Scan your skills before the worm runs

SkillShield detects propagation payload patterns, memory write surfaces, and supply-chain tampering at the skill layer — before your agents ever execute the code.

Get early access