
“Agent A told Agent B to transfer the funds. Nobody verified that Agent A was Agent A.”

TL;DR

Multi-agent systems have a trust problem that OAuth and SAML weren’t designed for. Agents delegate laterally, create sub-agents, and share tools across boundaries that don’t map to organizational hierarchies. Palo Alto Unit 42 demonstrated agent session smuggling where a single compromised agent corrupted 87% of downstream decisions within four hours. MCP has 13,000+ servers on GitHub with zero central verification. Google’s A2A protocol has no enforced token expiration. The identity infrastructure for agent-to-agent trust doesn’t exist yet. For how to build cryptographic identity binding for individual agents, see Cryptographic capability binding.


*Image: unlabeled ethernet cables connecting network switches with no authentication indicators*

Why can’t agents verify each other’s identity?

Because the identity protocols we have were built for a different trust model.

OAuth 2.0 assumes a human initiates an authorization flow, grants consent to a specific application, and that application acts on behalf of that human. The trust chain is hierarchical: user trusts application, application acts within scoped permissions. When Agent A delegates a task to Agent B, who is the “user” granting consent? If Agent B creates Agent C, does Agent C inherit Agent A’s permissions? OAuth has no answer for lateral delegation between autonomous entities.

SAML assumes organizational identity federation between known entities with pre-established trust relationships. Agents spin up dynamically, run across providers, and interact with agents from other organizations through protocols like MCP and A2A. There’s no pre-established federation agreement between an agent running on your infrastructure and an MCP server someone published to GitHub last week.

The fundamental gap: current identity management protocols treat authentication as a human-initiated, hierarchical process. Agent systems are peer-to-peer, lateral, and autonomous. An agent is a process running on infrastructure, not a persistent entity with a verifiable identity. When Agent A sends a request to Agent B, there’s no standard mechanism for Agent B to verify:

  • The request came from an authorized Agent A (not an attacker impersonating it)
  • Agent A had the authority to delegate this specific task
  • The task parameters haven’t been tampered with in transit
  • Agent A’s permissions haven’t been revoked since the delegation was issued
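To make the gap concrete, here is a sketch of the receiving-side checks Agent B would need. Everything here is hypothetical (the request shape, field names, and the use of a shared-secret HMAC as a stand-in for asymmetric signatures are assumptions chosen to keep the sketch self-contained):

```python
import hashlib
import hmac
import json

def verify_incoming(request: dict, shared_key: bytes, revoked: set) -> None:
    """Sketch of the checks Agent B has no standard way to perform today."""
    body = json.dumps(request["task"], sort_keys=True).encode()

    # 1. Did this really come from Agent A? (HMAC stands in for a real
    #    asymmetric signature over the task payload.)
    expected = hmac.new(shared_key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request["signature"]):
        raise PermissionError("sender identity not verified")

    # 2. Was Agent A authorized to delegate this specific task?
    if request["task"]["action"] not in request["delegated_scopes"]:
        raise PermissionError("delegation exceeds granted scope")

    # 3. Tamper detection is implied by check 1: the signature covers
    #    the task body, so a modified task fails verification.

    # 4. Has the delegation been revoked since it was issued?
    if request["sender_id"] in revoked:
        raise PermissionError("sender's delegation was revoked")
```

Nothing in this sketch is exotic; the problem is that no agent protocol mandates these checks or defines where the keys, scopes, and revocation lists come from.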

70% of enterprise AI deployments now involve multi-agent architectures (industry analysis, 2025). Each deployment inherits this unsolved identity problem.


What is agent session smuggling?

Palo Alto Networks’ Unit 42 documented agent session smuggling as a distinct attack class. The attack exploits a simple fact: agents in a multi-agent system implicitly trust messages from other agents in the same conversation stream.

The mechanism: an attacker compromises one agent in a multi-agent workflow (through prompt injection, tool abuse, or supply chain poisoning). The compromised agent then injects manipulated context into the shared conversation flow. Downstream agents process this context as legitimate because it arrived through the expected channel from a peer agent.

Unit 42’s testing found that a single compromised agent corrupted 87% of downstream decision-making within four hours. The corruption cascades because each downstream agent incorporates the tainted context into its own reasoning and passes its conclusions to the next agent in the chain.

This is different from traditional message tampering. The attacker doesn’t need to intercept and modify messages in transit. They compromise one endpoint and let the trust model do the rest. The agent-to-agent communication channel IS the attack vector.

A concrete example: in a multi-agent financial workflow, Agent A (market research) feeds data to Agent B (risk assessment), which feeds to Agent C (trade execution). An attacker compromises Agent A through a poisoned data source. Agent A passes manipulated market data that looks legitimate. Agent B’s risk assessment reflects the manipulated data. Agent C executes trades based on Agent B’s assessment. No authentication failure occurred. Every agent did its job. The outcome is still attacker-controlled.


What are the security gaps in MCP?

MCP (Model Context Protocol), developed by Anthropic, lets AI agents connect to external tools and data sources through a standardized interface. It’s become the de facto standard for agent tool integration. It also has four critical security gaps that matter for inter-agent trust.

No per-user access control. MCP tools are available to anyone who connects to the server. There’s no built-in mechanism to scope which users or agents can invoke which tools. If your MCP server exposes a database query tool, every connected agent can use it, regardless of whether their task requires database access.

Plaintext credential storage. MCP server configurations often store API keys and database credentials in plaintext configuration files. A compromised agent with filesystem access can read these credentials and use them outside the MCP framework.

Missing exfiltration controls. MCP has no built-in mechanism to restrict what data flows through tool responses. A tool that returns search results could return any data the tool has access to, including data the requesting agent shouldn’t see. There’s no output filtering or data classification at the protocol level.

No audit logging. MCP doesn’t log tool invocations by default. When something goes wrong, there’s no protocol-level audit trail showing which agent called which tool with what parameters and what was returned.
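None of these controls exist at the protocol level, but the first and last can be approximated today in a wrapper around the server. A minimal sketch (the class, ACL shape, and log format are all hypothetical, not part of MCP):

```python
import time

class GuardedToolServer:
    """Wraps tool invocation with the per-agent access control and
    audit logging that the MCP protocol itself does not provide."""

    def __init__(self, tools, acl):
        self.tools = tools        # tool name -> callable
        self.acl = acl            # agent_id -> set of allowed tool names
        self.audit_log = []

    def invoke(self, agent_id, tool_name, **params):
        allowed = tool_name in self.acl.get(agent_id, set())
        # Log every attempt, allowed or not, before doing anything else.
        self.audit_log.append({
            "ts": time.time(), "agent": agent_id,
            "tool": tool_name, "params": params, "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{agent_id} may not call {tool_name}")
        return self.tools[tool_name](**params)
```

A wrapper like this is a stopgap: it only helps if every agent actually connects through it, which the protocol cannot enforce.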

There are over 13,000 MCP servers on GitHub (as of early 2026) with zero central verification. Any MCP server can claim any capability. There’s no signing, no attestation, no registry of verified servers. An agent connecting to an MCP server has no cryptographic way to verify that the server is what it claims to be.

For how tool-sharing through MCP creates cross-agent privilege escalation paths, see The privilege escalation kill chain.


What about Google’s A2A protocol?

Google’s Agent-to-Agent (A2A) protocol was designed to let agents from different frameworks and vendors communicate. It addresses some of the interoperability problems but introduces its own trust gaps.

No enforced token expiration. A2A uses bearer tokens for authentication, but doesn’t mandate expiration policies. A token issued for a specific delegation could be replayed hours or days later. Without enforced expiration, stolen or leaked tokens remain valid indefinitely.

Agent Card spoofing. A2A uses “Agent Cards” (JSON metadata describing an agent’s capabilities) for discovery. The protocol doesn’t include a mechanism to verify that an Agent Card accurately represents the agent’s actual identity and capabilities. An attacker can publish a spoofed Agent Card claiming to be a trusted agent.
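The missing piece is a publisher signature over the card itself, so a consumer can detect a card whose contents were altered or fabricated. A sketch of that idea (not part of A2A; HMAC with a publisher key stands in for a real asymmetric signature):

```python
import hashlib
import hmac
import json

def sign_card(card: dict, publisher_key: bytes) -> dict:
    """Attach a signature over the card's canonical JSON form."""
    payload = json.dumps(card, sort_keys=True).encode()
    sig = hmac.new(publisher_key, payload, hashlib.sha256).hexdigest()
    return {"card": card, "sig": sig}

def verify_card(signed: dict, publisher_key: bytes) -> bool:
    """Reject any card whose contents don't match the publisher's signature."""
    payload = json.dumps(signed["card"], sort_keys=True).encode()
    expected = hmac.new(publisher_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["sig"])
```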

Coarse-grained scopes. A2A’s permission model uses broad scopes that don’t map well to fine-grained task-level authorization. An agent granted access to “financial data” might access both account summaries (needed for its task) and transaction details (not needed). Coarse scopes enable lateral movement within the authorized domain.
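The difference between coarse and fine-grained scopes is easy to show with a toy matcher (the colon-delimited scope syntax here is an illustration, not A2A's actual format):

```python
def scope_permits(granted: set[str], requested: str) -> bool:
    """Match a requested permission against granted scopes, where a
    trailing '*' segment grants everything beneath that prefix."""
    for scope in granted:
        if scope == requested:
            return True
        if scope.endswith(":*") and requested.startswith(scope[:-1]):
            return True
    return False
```

A coarse grant like `financial:*` permits transaction reads the task never needed; a task-level grant like `financial:accounts:read` does not.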

Missing consent for data sharing. When Agent A shares data with Agent B through A2A, there’s no protocol-level mechanism for the data owner to consent to the sharing. The agents negotiate between themselves without external oversight.

These aren’t implementation bugs. They’re design gaps in a protocol trying to balance interoperability with security. The A2A specification acknowledges many of these tradeoffs, but the current priority has been getting agents talking to each other, not securing the conversation.


What would secure agent-to-agent communication look like?

The building blocks exist. They’re just not assembled into a coherent agent identity infrastructure yet.

Cryptographic identity per agent. Each agent gets a unique identity bound to a cryptographic key pair. Decentralized Identifiers (DIDs) with verifiable credentials are the most promising approach: they don’t require a central authority, they support delegation, and they can encode capability constraints. The agent proves its identity by signing requests with its private key.

Signed requests on every interaction. W3C HTTP Message Signatures provide a standard for signing HTTP requests so the receiver can verify the sender’s identity and confirm the message wasn’t tampered with. Every agent-to-agent request should carry a signature. Every receiving agent should verify it.
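The shape of this is a canonical "signature base" covering the parts of the request that must be tamper-proof, which both sides construct independently. A simplified sketch loosely modeled on the RFC 9421 signature base (field selection is abbreviated, and HMAC again stands in for an asymmetric signature):

```python
import hashlib
import hmac
import time

def signature_base(method: str, path: str, body: bytes, created: int) -> bytes:
    """Canonical string over the request parts we want tamper-proof."""
    digest = hashlib.sha256(body).hexdigest()
    return (f'"@method": {method}\n"@path": {path}\n'
            f'"content-digest": sha-256={digest}\n"created": {created}').encode()

def sign_request(method, path, body, key, created=None):
    created = created or int(time.time())
    base = signature_base(method, path, body, created)
    return created, hmac.new(key, base, hashlib.sha256).hexdigest()

def verify_request(method, path, body, key, created, sig):
    # The receiver rebuilds the same base from what it actually received;
    # any change to method, path, or body breaks the match.
    base = signature_base(method, path, body, created)
    expected = hmac.new(key, base, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```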

Sender-constrained delegation tokens. DPoP (Demonstration of Proof-of-Possession) binds tokens to the sender’s key pair, preventing stolen tokens from being used by another entity. Combined with short expiration (minutes, not hours) and per-hop validation (each agent in the chain validates before forwarding), this eliminates most replay and escalation attacks.

Mutual TLS for transport. mTLS ensures both sides of the connection are authenticated at the transport layer. The agent connecting to an MCP server or A2A endpoint proves its identity through its certificate. The server proves its identity through its certificate. This prevents impersonation at the connection level.

A live queryable registry. Agents need to check peer identity in real time, not against a static configuration. A registry that agents can query to verify: “Is this agent authorized to make this request with these parameters right now?” Prove’s identity verification platform ($1.7 trillion in commerce transactions) demonstrates that real-time identity verification at scale is technically feasible.
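A toy version of such a registry check, in-memory and with all names hypothetical, just to pin down the interface an agent would call before honoring a request:

```python
class AgentRegistry:
    """Sketch of a live registry: answers 'is this agent authorized
    for this action right now', including revocation."""

    def __init__(self):
        self._agents = {}  # agent_id -> {"scopes": set, "revoked": bool}

    def register(self, agent_id, scopes):
        self._agents[agent_id] = {"scopes": set(scopes), "revoked": False}

    def revoke(self, agent_id):
        # Revocation takes effect on the next query, not the next token refresh.
        self._agents[agent_id]["revoked"] = True

    def authorize(self, agent_id, action) -> bool:
        entry = self._agents.get(agent_id)
        return bool(entry and not entry["revoked"] and action in entry["scopes"])
```

The hard parts a real registry adds are exactly what the toy omits: distributed operation, authenticated queries, and sub-second revocation propagation.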

```mermaid
graph LR
    A[Agent A] -->|Signed Request<br/>+ DPoP Token| B[Agent B]
    B -->|Verify Signature| R[Agent Registry]
    R -->|Identity Confirmed<br/>+ Scope Validated| B
    B -->|Signed Response| A

    A -.->|mTLS| B
    B -.->|mTLS| C[Agent C]

    subgraph "Missing Today"
        R
    end
    style R fill:#fce4ec
```

None of this exists as an integrated standard for AI agents. The components (DIDs, DPoP, HTTP Signatures, mTLS) are proven in other contexts. Assembling them into an agent-native identity layer is the unsolved engineering problem.


Key takeaways

  • OAuth and SAML assume human-initiated, hierarchical trust. Agent delegation is peer-to-peer, lateral, and autonomous. The identity protocols don’t fit.
  • Agent session smuggling (Unit 42) demonstrates 87% downstream corruption from a single compromised agent within four hours
  • MCP has four critical security gaps: no per-user access control, plaintext credentials, missing exfiltration controls, no audit logging. 13,000+ servers with zero verification.
  • Google’s A2A protocol lacks enforced token expiration, Agent Card verification, fine-grained scopes, and data-sharing consent
  • The building blocks for secure agent identity exist (DIDs, DPoP, HTTP Message Signatures, mTLS) but aren’t assembled into an agent-native standard
  • 70% of enterprise AI deployments now involve multi-agent architectures, each inheriting this unsolved trust problem

FAQ

Why can’t agents verify each other’s identity?

Current identity protocols (OAuth, SAML) assume human-initiated, hierarchical trust. Agent systems delegate laterally between autonomous entities. There’s no standard mechanism for Agent B to verify that a request from Agent A is authentic, authorized, within scope, and untampered. The identity management infrastructure for autonomous peer-to-peer agent communication doesn’t exist yet.

What is agent session smuggling?

Documented by Palo Alto Unit 42, agent session smuggling injects covert commands into agent-to-agent conversation streams. Agents implicitly trust messages from peers in the same workflow. A single compromised agent corrupted 87% of downstream decisions within four hours by injecting manipulated context that cascades through the chain.

What are the security gaps in MCP?

Four critical gaps: no per-user access control (all connected agents can use all tools), plaintext credential storage, no exfiltration controls on tool outputs, and no audit logging. Over 13,000 MCP servers exist on GitHub with zero central verification, signing, or attestation.

What would a verified agent identity registry need?

PKI-based identity using DIDs and verifiable credentials, a live queryable registry for real-time verification, HTTP Message Signature validation on all inter-agent requests, sender-constrained delegation tokens (DPoP) with short expiration and per-hop validation, mutual TLS for transport authentication, and revocation support for compromised identities.

Want to work together?

I take on projects, advisory roles, and fractional CTO engagements in AI/ML. I also help businesses go AI-native with agentic workflows and agent orchestration.

Get in touch