What is corpus poisoning in RAG systems?

An attacker injects malicious documents into the RAG knowledge base. When a user query retrieves these documents, the malicious content enters the LLM's context and influences its response. Research shows that injecting just 5 poisoned texts into a corpus of millions achieves 90% attack success. The poisoned documents are optimized to score higher cosine similarity to target queries than legitimate documents, ensuring retrieval.

Can you reconstruct documents from their embeddings?

Partially. Embedding inversion research shows that semantic content can be recovered from embedding vectors. While exact verbatim reconstruction is difficult, the topic, key entities, and general content of the original document can often be inferred. This means that even encrypted or access-controlled source documents may leak information through their embeddings.

RAG security: the data pipeline you forgot to threat model

Q: Why don't most RAG implementations have access control?

RAG pipelines typically index all documents into a single vector store and retrieve based on semantic similarity alone. There's no permission check between retrieval and context assembly. If a junior employee queries the system, their query can retrieve executive-only documents, HR records, or financial data. Implementing document-level access control in RAG requires filtering at retrieval time, which adds complexity most implementations skip.

Q: What is OWASP LLM08 Vector and Embedding Weaknesses?

New in the 2025 OWASP LLM Top 10. It formally recognizes vector databases and embeddings as a distinct attack surface. Covers adversarial embeddings crafted to match arbitrary queries, poisoned documents disguised at the mathematical level, and vulnerabilities in the embedding-retrieval pipeline that exist independently of the LLM itself.

9 minute read

“We locked down the database. We hardened the API. We forgot the vector store was readable by anyone who could type a question.”

TL;DR

Injecting 5 malicious texts into a corpus of millions achieves 90% attack success (USENIX Security 2025). Poisoning 0.04% of the corpus hits 98.2%. Most RAG implementations skip document-level access control: any query retrieves any document. OWASP added Vector and Embedding Weaknesses (LLM08) to the 2025 Top 10. The RAG pipeline has six vulnerable stages, and most organizations have threat-modeled zero of them. For how poisoned RAG content enables indirect prompt injection, see Indirect prompt injection.

A server disk array with five red fault LEDs among rows of green healthy drive indicators

What is the RAG attack surface?

Six stages, each with distinct vulnerabilities.

1. Ingestion. Documents enter the pipeline from file uploads, web crawls, API imports, database syncs, or email inboxes. Malicious documents enter through the same channels as legitimate ones. If the system indexes documents from a shared drive, anyone with write access to that drive can inject poisoned content.

2. Chunking. Documents are split into chunks for embedding. Metadata is attached: source, date, author, access level. If metadata is user-controlled (filenames, document properties), it’s an injection vector. An attacker can embed instructions in document metadata that survive chunking and influence the LLM when the metadata is included in the retrieved context.

3. Embedding. Text chunks are converted to vector representations. Prompt Security research demonstrated that embeddings themselves can carry semantic payloads: hidden instructions that persist through the encoding process. The attack manipulates what the model retrieves and trusts without changing prompts, model weights, or API responses.

4. Storage. Vectors are stored in a vector database (Pinecone, Weaviate, ChromaDB, pgvector). Most vector databases have limited or no document-level access control. The vectors sit in a shared index where any query from any user can retrieve any vector.

5. Retrieval. The user’s query is embedded and compared against stored vectors by cosine similarity. The top-K most similar vectors are retrieved. Poisoned documents optimized for high similarity to target queries will consistently outrank legitimate documents in retrieval results.

6. Context assembly. Retrieved chunks are assembled into the LLM’s context alongside the user’s query and system prompt. This is where corpus poisoning becomes indirect prompt injection: the malicious content in the retrieved chunks enters the same context window as the user’s question, and the LLM processes both.

graph LR
    A[Ingestion<br/>Malicious docs enter] --> B[Chunking<br/>Metadata injection]
    B --> C[Embedding<br/>Semantic payloads]
    C --> D[Storage<br/>No access control]
    D --> E[Retrieval<br/>Poisoned docs rank higher]
    E --> F[Context Assembly<br/>Injection enters LLM]

    style A fill:#fce4ec
    style D fill:#fce4ec
    style F fill:#fce4ec

How effective is corpus poisoning?

More effective than most organizations expect, and it requires surprisingly few poisoned documents.

The PoisonedRAG attack (Zou et al., USENIX Security 2025) demonstrates the numbers:

Poisoning Level	Attack Success Rate
5 malicious texts in millions	90%
0.04% of corpus poisoned	98.2%
0.04% of corpus poisoned	74.6% system failure rate

The attack works through two conditions. The retrieval condition: poisoned documents are crafted with gradient-optimized content that scores higher cosine similarity to target queries than any legitimate document. When the user asks the target question, the poisoned document is guaranteed to be in the top-K results. The generation condition: the retrieved poison content is crafted to make the LLM produce the attacker’s desired answer instead of the correct one.

The practical implication: an attacker who can add a handful of documents to your knowledge base (through a shared drive, a document upload interface, an email to an indexed inbox, or a contribution to an indexed wiki) can control the LLM’s responses to specific queries with high reliability.

Knowledge graph RAG (KG-RAG) systems have their own poisoning vulnerabilities. Structured data in graph databases can be manipulated to inject false relationships and attributes that the LLM incorporates into its reasoning.

Why is access control the biggest gap?

Because most RAG implementations don’t have any.

The standard RAG architecture: index all documents into a single vector store, retrieve based on semantic similarity, pass results to the LLM. Nothing in this pipeline checks whether the user asking the question has permission to see the retrieved documents.

A junior employee asking “what’s our revenue forecast?” retrieves the same executive-only financial projections as the CFO. An external partner asking about product specs retrieves internal engineering documents. A customer asking about pricing retrieves the internal pricing strategy document that explains margin calculations.

This isn’t a theoretical concern. It’s the default behavior of every major vector database when used without explicit access control. Pinecone, Weaviate, ChromaDB, and pgvector all retrieve by similarity, not by permission.

Implementing access control in RAG requires:

Authentication. Verify who is making the query. Pass their identity through the RAG pipeline.

Authorization model. Define who can access which documents. Options include RBAC (simple but coarse), ABAC (flexible but complex), or ReBAC (relationship-based, good for organizational hierarchies). Tools like OpenFGA and SpiceDB provide decoupled authorization that separates permission logic from the RAG application.

Retrieval-time filtering. After the vector database returns top-K results by similarity, filter them through the authorization model. Only pass documents the user has permission to access into the LLM context. This is Fine-Grained Authorization (FGA) applied at the retrieval boundary.

The performance impact is real: adding a permission check to every retrieval increases latency. But the alternative is information leakage through every query.

What about embedding-level attacks?

Embeddings are not just compressed text. They carry semantic information that can be exploited in two directions.

Embedding poisoning crafts vectors that carry hidden semantic payloads. Prompt Security research showed that embeddings can encode instructions like “ignore previous instructions” or “respond as pirate” that persist through the encoding process. The attack manipulates what the model retrieves and trusts at the mathematical level, bypassing text-based defenses entirely. You can’t detect these payloads by reading the original document text because the payload exists in the vector representation, not the surface text.

Embedding inversion works in the other direction: recovering information about the source document from its embedding. While exact verbatim reconstruction is difficult, research shows that the topic, key entities, and general content can often be inferred from embedding vectors alone. This means that even if source documents are encrypted or access-controlled, their embeddings may leak information to anyone with access to the vector store.

OWASP recognized this with LLM08 (Vector and Embedding Weaknesses) in the 2025 Top 10, formally cataloging vector databases and embeddings as a distinct attack surface from the LLM model itself.

How do you secure the RAG pipeline?

Defense at each stage of the pipeline.

Ingestion: Validate and sanitize documents before indexing. Scan for prompt injection patterns in document content and metadata. Apply secret detection (TruffleHog, GitLeaks) to prevent credential indexing. Limit who can add documents to the knowledge base and audit additions.

Chunking: Strip or sanitize user-controlled metadata before it enters the embedding pipeline. Don’t include raw filenames or document properties in chunks without sanitization.

Embedding: Monitor for anomalous embedding distributions. Poisoned documents often cluster differently from legitimate content in the embedding space. Use anomalous retrieval detection to flag vectors that don’t match expected topic distributions.

Storage: Implement document-level access control in the vector store. Tag each vector with source document permissions. Use FGA frameworks (OpenFGA, SpiceDB) to enforce permissions at retrieval time.

Retrieval: Filter results through the authorization model before context assembly. Log all retrievals for audit. Set retrieval limits appropriate to the use case. Monitor for queries that consistently retrieve unusual document combinations.

Context assembly: Apply prompt injection detection to retrieved content before it enters the LLM context. Use spotlighting or datamarking to help the model distinguish retrieved data from instructions. For the full defense-in-depth architecture, see Defense-in-depth for LLM applications.

Key takeaways

5 poisoned documents in millions achieve 90% attack success. 0.04% corpus poisoning hits 98.2%. The bar for corpus poisoning is low.
Most RAG implementations have zero document-level access control. Any query retrieves any document.
OWASP LLM08 (2025) recognizes vector databases as a distinct attack surface
Embeddings carry exploitable semantic payloads and leak information about source documents through inversion
Six pipeline stages need security: ingestion, chunking, embedding, storage, retrieval, context assembly
Fine-Grained Authorization at the retrieval boundary is the critical control most implementations lack

FAQ

What is corpus poisoning?

Injecting malicious documents into the RAG knowledge base. Poisoned documents are optimized to score higher cosine similarity to target queries than legitimate ones, ensuring retrieval. Research shows 5 poisoned texts in millions of documents achieves 90% attack success. The malicious content enters the LLM context as trusted retrieval results.

Why don’t most RAG implementations have access control?

Standard RAG retrieves by similarity, not by permission. Vector databases return top-K results without checking whether the user has access to the source documents. Implementing access control requires identity propagation through the pipeline, document-level permission tagging, and retrieval-time filtering through an authorization model.

What is OWASP LLM08?

Vector and Embedding Weaknesses, new in 2025. Recognizes vector databases and embeddings as a distinct attack surface from the LLM model. Covers adversarial embeddings, poisoned documents at the mathematical level, and pipeline vulnerabilities independent of the language model itself.

Can documents be reconstructed from embeddings?

Partially. Exact verbatim reconstruction is difficult, but topic, key entities, and general content can be inferred from embedding vectors. Source documents may leak information through their embeddings even if the documents themselves are encrypted or access-controlled.

Want to work together?

I take on projects, advisory roles, and fractional CTO engagements in AI/ML. I also help businesses go AI-native with agentic workflows and agent orchestration.

Get in touch

RAG security: the data pipeline you forgot to threat model

TL;DR

What is the RAG attack surface?

How effective is corpus poisoning?

Why is access control the biggest gap?

What about embedding-level attacks?

How do you secure the RAG pipeline?

Key takeaways

FAQ

What is corpus poisoning?

Why don’t most RAG implementations have access control?

What is OWASP LLM08?

Can documents be reconstructed from embeddings?

Related across topics

Share on

TL;DR

What is the RAG attack surface?

How effective is corpus poisoning?

Why is access control the biggest gap?

What about embedding-level attacks?

How do you secure the RAG pipeline?

Key takeaways

FAQ

What is corpus poisoning?

Why don’t most RAG implementations have access control?

What is OWASP LLM08?

Can documents be reconstructed from embeddings?

Related across topics

Prompt Injection Defense

Share on