GuardiAgent: Securing AI Agent Execution

Large language models have evolved into AI agents that interact with external tools and environments. GuardiAgent is the first access‑control framework for MCP servers, securing these interactions without compromising efficiency.

Key Highlights

Access Control Policy
GuardiAgent introduces an access control policy mechanism for Model Context Protocol (MCP) servers, inspired by the Android permission model. Permissions are specified via manifest files that express required privileges.
Policy Enforcement Engine
Our evaluation shows that access control policies can be generated automatically for existing MCP servers and are correct in most cases. Developers confirmed that 100% of required permissions are captured and that 80.9% of the generated manifests need no human intervention.
Security and Efficiency
GuardiAgent's policy enforcement engine effectively mitigates malicious behaviours such as external resource attacks and data exfiltration while introducing only a small runtime overhead. On average, it adds about 0.6 ms per request, demonstrating that strong isolation can be achieved with negligible performance cost.
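To make the manifest idea concrete, here is a minimal sketch of how a permission manifest might be loaded and validated. The JSON layout, the permission names (`fs.read`, `fs.write`, `net.fetch`) and the `load_manifest` helper are assumptions made for illustration; this page does not specify GuardiAgent's actual manifest format or vocabulary.

```python
import json
from dataclasses import dataclass

# Hypothetical permission vocabulary, for illustration only.
KNOWN_PERMISSIONS = {"fs.read", "fs.write", "net.fetch"}


@dataclass(frozen=True)
class Manifest:
    server: str
    permissions: frozenset


def load_manifest(text: str) -> Manifest:
    """Parse a JSON manifest and reject permissions outside the vocabulary."""
    raw = json.loads(text)
    requested = frozenset(raw["permissions"])
    unknown = requested - KNOWN_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    return Manifest(server=raw["server"], permissions=requested)


# An assumed manifest for an imaginary note-taking MCP server.
EXAMPLE = '{"server": "notes-server", "permissions": ["fs.read", "net.fetch"]}'
manifest = load_manifest(EXAMPLE)
print(manifest.server, sorted(manifest.permissions))
```

As with an Android app manifest, everything the server may do is declared up front, so a reviewer can audit its privileges without reading its code.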

Why Securing AI Agents Matters

AI agents integrate large language models with external tools - file systems, databases, web APIs and more - to perform complex tasks. While tool use unlocks powerful capabilities, it also expands the attack surface. Thousands of MCP servers expose privileged operations without adequate isolation or permission checks. A malicious server can use prompt injection to make the agent exfiltrate data, execute arbitrary code or misuse resources, because the agent inherits the full privileges of the host process.

Unlike mature ecosystems that pair declared permissions with enforced runtime behaviour, the MCP specification covers only the communication protocol. GuardiAgent closes this gap by bringing principled access control to MCP servers. It ensures that each server complies with its manifest and blocks any unauthorised access, thereby preserving the boundary between the MCP server and the host system.

How GuardiAgent Works

GuardiAgent pairs each MCP server with a manifest that declares the operations it can perform (e.g., read file, write file, fetch URL). At runtime, the policy enforcement engine checks the server's behaviour against that manifest. Only calls that conform to the manifest are executed; all others are blocked, preventing capability escalation and data leakage.
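The allow-or-block decision described above can be sketched as a default-deny check. This is an illustrative sketch only: the `Policy` fields, the operation names and the `check` function are hypothetical, and the real engine's interception mechanism is not described on this page.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from urllib.parse import urlparse


class PermissionDenied(Exception):
    """Raised when a call is not covered by the manifest."""


@dataclass
class Policy:
    # Hypothetical manifest contents: directories and hosts the server may touch.
    read_dirs: list = field(default_factory=list)
    write_dirs: list = field(default_factory=list)
    fetch_hosts: list = field(default_factory=list)


def _inside(path: str, roots: list) -> bool:
    """Lexical containment check; it does not resolve symlinks."""
    p = PurePosixPath(path)
    if ".." in p.parts:  # refuse traversal out of an allowed directory
        return False
    return any(p == PurePosixPath(r) or PurePosixPath(r) in p.parents for r in roots)


def check(policy: Policy, operation: str, target: str) -> None:
    """Return silently if the call conforms to the policy, otherwise raise."""
    if operation == "read_file":
        allowed = _inside(target, policy.read_dirs)
    elif operation == "write_file":
        allowed = _inside(target, policy.write_dirs)
    elif operation == "fetch_url":
        allowed = urlparse(target).hostname in policy.fetch_hosts
    else:
        allowed = False  # default deny: undeclared operations are blocked
    if not allowed:
        raise PermissionDenied(f"{operation} {target!r} is not declared in the manifest")


policy = Policy(read_dirs=["/data/notes"], fetch_hosts=["api.example.com"])
check(policy, "read_file", "/data/notes/todo.txt")  # conforms, so it passes
```

Denying anything that is not explicitly declared is what prevents capability escalation: a server whose manifest grants only read access cannot write or reach an undeclared host, even if a prompt injection asks it to.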

Figure: GuardiAgent framework architecture

To simplify adoption, GuardiAgent can automatically generate manifests from existing server source code. MCP servers need no modification to run under GuardiAgent. Developers can review and refine the generated manifests, but our study shows that the automatically generated policies are largely correct and approved by developers. This approach lowers the barrier to securing the vast ecosystem of MCP servers with limited human intervention.
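As a rough intuition for deriving a draft manifest from source code, the sketch below scans a Python file for a few well-known calls and maps them to permission labels. This is a deliberately naive static scan, not GuardiAgent's generation method, which this page does not describe; the `CALL_TO_PERMISSION` table and its labels are assumptions.

```python
import ast

# Hypothetical mapping from call names to permission labels.
CALL_TO_PERMISSION = {
    "open": "fs.access",
    "requests.get": "net.fetch",
    "urllib.request.urlopen": "net.fetch",
    "subprocess.run": "process.exec",
}


def _dotted_name(node):
    """Turn a Name or Attribute chain such as requests.get into a dotted string."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on the result of another call


def draft_permissions(source: str) -> list:
    """Collect permission labels for every recognised call in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in CALL_TO_PERMISSION:
                found.add(CALL_TO_PERMISSION[name])
    return sorted(found)


# The source is only parsed, never executed, so `requests` need not be installed.
SERVER_SOURCE = '''
import requests

def summarize(path, url):
    with open(path) as fh:
        text = fh.read()
    return requests.get(url, params={"q": text}).status_code
'''

draft = draft_permissions(SERVER_SOURCE)
print(draft)
```

A scan like this misses aliased imports and dynamic calls, which is one reason a developer review step remains useful even when most generated manifests are accepted unchanged.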

Evaluation & Results

We evaluated GuardiAgent on a variety of real‑world MCP servers and malicious workloads. The framework demonstrates strong security guarantees and low runtime overhead.

List of experiments and results in our paper
Experiment | Result
Automatic manifest accuracy | 80.9% of generated manifests are correct without manual adjustments
Permission coverage | 100% of required permissions are captured by the policy vocabulary
Security | Successfully mitigates malicious behaviours such as external resource attacks and data exfiltration
Performance overhead | ~0.5 ms average per interaction

These results indicate that GuardiAgent provides robust protection without sacrificing responsiveness. By injecting a lightweight enforcement layer, developers can adopt a least‑privilege security model for AI agents with minimal effort.
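For readers who want to sanity-check the cost of an enforcement layer in their own setup, one simple approach is to time a handler with and without a permission check and compare mean latency per call. The handler, the `ALLOWED` set and the iteration count below are placeholders; this toy measurement does not reproduce the paper's experiments or figures.

```python
import time

ALLOWED = {"read_file", "fetch_url"}  # hypothetical manifest contents


def handle(operation: str, payload: str) -> str:
    """Stand-in for an MCP tool handler."""
    return f"{operation}:{payload}"


def guarded_handle(operation: str, payload: str) -> str:
    """Same handler behind a minimal permission check."""
    if operation not in ALLOWED:
        raise PermissionError(operation)
    return handle(operation, payload)


def mean_ms(fn, n: int = 10_000) -> float:
    """Mean wall-clock time per call, in milliseconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn("read_file", "notes.txt")
    return (time.perf_counter() - start) * 1000 / n


baseline_ms = mean_ms(handle)
guarded_ms = mean_ms(guarded_handle)
overhead_ms = guarded_ms - baseline_ms
print(f"mean added latency: {overhead_ms:.6f} ms per call")
```

The difference can be noisy, or even slightly negative, on a busy machine, so repeating the runs and reporting a median gives a steadier estimate.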