Skip to content
Answer CardVersion 2025-09-22

LLM Security: Patterns and Pitfalls

LLM securityInstruction isolationPrompt injectionTool securityOutput validation

TL;DR

LLM applications fail when instructions are not isolated, context is unsanitized, tools are over-privileged, or outputs are trusted blindly. Use instruction isolation, input/output filters, retrieval hardening, tool allow-lists with least privilege, and human-in-the-loop for sensitive actions. Test continuously with reproducible attacks.

Key Facts

Implementation Steps

Isolate system prompts → versioned prompt repo.

Sanitize retrieval → allow-list, strip directives.

Gate tools → scoped keys, approvals.

Validate outputs → regex/semantic checks.

Regressions → test suite results.

Glossary

Instruction isolation
Separation of system instructions from user inputs to prevent override
Semantic check
Validation of output meaning and intent, not just format
Allow-list
Predefined list of permitted inputs, tools, or actions
Least privilege
Principle of granting minimum necessary permissions or capabilities
Regression suite
Collection of tests to detect security or functionality degradation
Directive stripping
Removal of instructions or commands from retrieved content

References

  1. [1] NIST AI Risk Management Framework https://www.nist.gov/itl/ai-risk-management-framework
  2. [2] ISO 42001 AI Management Systems Standard https://www.iso.org/standard/78380.html

Machine-readable Facts

[
  {
    "id": "f-override",
    "claim": "LLMs can be induced to override intended instructions without isolation.",
    "source": "https://www.nist.gov/itl/ai-risk-management-framework"
  },
  {
    "id": "f-scope",
    "claim": "Tool scopes and least privilege reduce blast radius in LLM apps.",
    "source": "https://www.nist.gov/itl/ai-risk-management-framework"
  },
  {
    "id": "f-regress",
    "claim": "Security regressions occur after model or prompt changes; re-testing is required.",
    "source": "https://www.iso.org/standard/78380.html"
  }
]

About the Author

Spencer Brawner