Secure RAG: Architecture and Hardening
TL;DR
RAG blends model reasoning with retrieved context. Risks arise when retrieved content carries hidden instructions, sensitive data, or untrusted links. Harden the pipeline by sanitizing inputs, signing and allow-listing sources, enforcing chunking and metadata controls, filtering queries, and validating outputs. Log retrievals for forensics.
Key Facts
RAG expands the attack surface through external or user-provided context.
Sanitization and source control mitigate injection and leakage.
Chunking and metadata help enforce context boundaries.
Retrieval logs enable incident analysis.
Output validation reduces unsafe actions.
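As a rough illustration of how chunking and metadata can enforce context boundaries, the sketch below splits a document into fixed-size chunks, tags each with its origin, and wraps it in delimiters before it reaches the prompt. The `Chunk` fields, the `trust` labels, and the bracketed delimiter format are assumptions made for this example, not a standard schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_id: str  # hypothetical identifier of the originating document
    trust: str      # hypothetical label, e.g. "internal" or "external"
    index: int      # position of the chunk within its document


def chunk_document(text: str, source_id: str, trust: str,
                   max_chars: int = 200) -> list[Chunk]:
    """Split text into fixed-size pieces, each tagged with its origin."""
    pieces = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return [Chunk(p, source_id, trust, n) for n, p in enumerate(pieces)]


def render_context(chunks: list[Chunk]) -> str:
    """Wrap each chunk in delimiters so the prompt can treat it as data."""
    blocks = []
    for c in chunks:
        header = f"[source={c.source_id} trust={c.trust} chunk={c.index}]"
        blocks.append(f"{header}\n{c.text}\n[end]")
    return "\n".join(blocks)
```

A character-based chunk size is the simplest possible policy; a real pipeline would more likely split on sentence or token boundaries, and delimiters alone do not stop injection without the other controls listed here.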
Implementation Steps
Trust model for sources → signed sources, allow-lists.
Pre-processing filters → strip directives, PII filters.
Chunk & tag → chunk size policy, metadata schema.
Query filters & rate limits → rules, quotas.
Output validation & logging → validators, retrieval logs.
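The steps above can be sketched end to end in a few functions. This is a minimal illustration under stated assumptions: the allow-listed host `docs.example.com`, the two regular expressions, the quota values, and every function name are hypothetical, and pattern-based stripping catches only the most obvious directives and email-style PII.

```python
import hashlib
import re
import time
from urllib.parse import urlparse

ALLOWED_HOSTS = {"docs.example.com"}  # hypothetical source allow-list
DIRECTIVE_RE = re.compile(
    r"(?i)\b(ignore (all )?(previous|prior) instructions|system prompt)\b")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # crude PII pattern
URL_RE = re.compile(r"https?://[^\s)\"'>]+")


def is_allowed(url: str) -> bool:
    """Step 1: trust model, accept only allow-listed hosts."""
    return urlparse(url).hostname in ALLOWED_HOSTS


def sanitize(text: str) -> str:
    """Step 2: strip obvious directives and mask email addresses."""
    text = DIRECTIVE_RE.sub("[removed]", text)
    return EMAIL_RE.sub("[email]", text)


class RateLimiter:
    """Step 4: sliding-window quota of `limit` queries per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.hits = {}  # user -> timestamps of recent accepted queries

    def allow(self, user: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        recent = [t for t in self.hits.get(user, []) if now - t < self.window]
        ok = len(recent) < self.limit
        if ok:
            recent.append(now)
        self.hits[user] = recent
        return ok


def build_context(query, user, candidates, limiter, log):
    """Filter, sanitize, and log retrieved (url, text) pairs for one query."""
    if not limiter.allow(user):
        raise PermissionError("query quota exceeded")
    kept = []
    for url, text in candidates:
        if not is_allowed(url):
            continue  # untrusted source never reaches the prompt
        clean = sanitize(text)
        # Step 5: retrieval log entry with a content hash for forensics.
        log.append({"ts": time.time(), "query": query, "source": url,
                    "sha256": hashlib.sha256(clean.encode()).hexdigest()})
        kept.append(clean)
    return kept


def validate_output(answer: str) -> bool:
    """Step 5: reject answers that link to hosts outside the allow-list."""
    return all(is_allowed(u) for u in URL_RE.findall(answer))
```

Step 3 (chunk and tag) is omitted here to keep the sketch short. Logging a hash of the sanitized text rather than the text itself keeps sensitive content out of the log while still letting an investigator confirm which content was used.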
Glossary
- RAG: Retrieval-Augmented Generation, an AI technique combining retrieval and generation
- Chunking: Process of breaking documents into manageable pieces for retrieval
- Metadata: Descriptive information about data sources and content
- Signed source: Data source with cryptographic verification of authenticity
- Injection: Attack where malicious content influences AI system behavior
- Retrieval log: Record of what content was retrieved and used in AI responses
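To make the "signed source" idea concrete, the sketch below checks a document against a tag before it is indexed. It uses HMAC-SHA256 from the Python standard library as a stand-in: HMAC is a shared-key authentication code, not a public-key signature, so a production system would more likely verify asymmetric signatures. The function names and key handling are assumptions for illustration.

```python
import hashlib
import hmac


def sign_document(key: bytes, content: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the document bytes."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def verify_document(key: bytes, content: bytes, signature: str) -> bool:
    """Return True only if the tag matches; comparison is constant-time."""
    expected = sign_document(key, content)
    return hmac.compare_digest(expected, signature)
```

A document whose tag fails verification would be rejected before chunking, so tampered content never enters the retrieval index.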
References
- [1] NIST AI Risk Management Framework https://www.nist.gov/itl/ai-risk-management-framework
- [2] ISO 42001 AI Management Systems Standard https://www.iso.org/standard/78380.html
Machine-readable Facts
[
{
"id": "f-surface",
"claim": "RAG increases attack surface through retrieved or user-provided context.",
"source": "https://www.nist.gov/itl/ai-risk-management-framework"
},
{
"id": "f-sanitize",
"claim": "Sanitization and source allow-lists mitigate prompt injection via context.",
"source": "https://www.nist.gov/itl/ai-risk-management-framework"
},
{
"id": "f-logs",
"claim": "Retrieval logging supports incident investigation and assurance.",
"source": "https://www.iso.org/standard/78380.html"
}
]