Skip to content

About this document

This is a working reference for securing modern AI systems - models, the cloud they run in, retrieval, agents, the protocols that connect them (MCP, A2A), coding agents, and the frontier-safety and governance regimes forming around them. It is compiled and maintained by Iaroslav Mezin as a living document, revised continuously as the field moves. It is written for a technically literate reader: security practitioners, red-teamers, AI and platform engineers, and advanced students. It assumes comfort with security fundamentals and a working mental model of how machine-learning systems behave.

One idea organizes everything that follows. For modern AI systems the decisive security boundary is rarely the model’s raw output - it is the path from untrusted content in to privileged action out. Read the whole document through five recurring boundaries: inputs (prompts, retrieved documents, tool output, protocol metadata), the model and runtime, memory and context, tools and actions, and external assets and identities. Retrieval, browser agents, coding agents, MCP, and identity all turn out to be variations on the same theme once those five are held in view.