Security Considerations
This page identifies security risks that implementers MUST consider when building runtimes, parsers, and tooling for Agent Format.
Condition Injection
Threat: `args_match` values in approval conditions or conditional routing may contain crafted patterns (e.g., complex regexes via the pattern matcher) that cause denial of service (ReDoS) or unintended matches.
Mitigation:
- Runtimes MUST validate pattern-matcher values against a safe regex subset or enforce execution time limits on pattern matching.
- Runtimes SHOULD reject patterns that exhibit catastrophic backtracking characteristics.
- Condition evaluation MUST be side-effect-free -- matching logic MUST NOT invoke external services or modify state.
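The backtracking check above can be approximated with a static pre-screen. The sketch below is illustrative, not a spec-mandated algorithm: it rejects the classic nested-quantifier shape (e.g., `(a+)+`) and over-long patterns before compilation; the function name and limits are assumptions.

```python
import re

# Hypothetical validator for args_match patterns. The length cap and the
# nested-quantifier heuristic are illustrative assumptions, not spec values.
MAX_PATTERN_LENGTH = 256
# A quantifier inside a group followed by a quantifier on the group
# (e.g. (a+)+ or (a*)*) is the classic catastrophic-backtracking shape.
NESTED_QUANTIFIER = re.compile(r"\([^)]*[+*][^)]*\)[+*{]")

def validate_args_match_pattern(pattern: str) -> None:
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern too long")
    if NESTED_QUANTIFIER.search(pattern):
        raise ValueError("nested quantifier (ReDoS risk)")
    re.compile(pattern)  # reject syntactically invalid patterns early
```

A static check like this is conservative; runtimes that need stronger guarantees should pair it with an execution time limit on matching.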
Template Injection
Threat: `message_template` fields in approval configurations use `{{variable}}` placeholder syntax. If template variables are populated with untrusted user input, an attacker could inject template directives that expand to unintended content.
Mitigation:
- Runtimes MUST use a strict allowlist of recognized template variables and MUST NOT evaluate arbitrary expressions within template placeholders.
- Template expansion MUST NOT support nested expansion (a variable's value MUST NOT itself be treated as a template).
- Runtimes SHOULD HTML-escape or context-escape expanded values when the approval message is rendered in a UI.
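The three rules above can be combined in a single-pass expander. In this sketch the allowlisted variable names are assumptions; what matters is that unknown placeholders are rejected, substituted values are never re-scanned as templates, and output is escaped for the rendering context.

```python
import html
import re

# Illustrative allowlist; the actual recognized variables are runtime-specific.
ALLOWED_VARS = {"tool_name", "agent_name", "requested_action"}
PLACEHOLDER = re.compile(r"\{\{\s*([a-z_]+)\s*\}\}")

def expand_template(template: str, values: dict) -> str:
    def substitute(match):
        name = match.group(1)
        if name not in ALLOWED_VARS:
            raise ValueError(f"unknown template variable: {name}")
        # Single-pass substitution: the value is escaped and never re-scanned,
        # so a value containing {{...}} is emitted literally, not expanded.
        return html.escape(values.get(name, ""))
    return PLACEHOLDER.sub(substitute, template)
```

Because `re.sub` does not re-scan replacement text, nested expansion is structurally impossible rather than merely forbidden by convention.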
Governance Policy Bypass
Threat: The two-layer constraint model relies on runtimes faithfully enforcing the `tighten_only_invariant`. A non-conforming or compromised runtime could ignore governance policies, weaken constraints, or skip approval requirements.
Mitigation:
- The Governance Team SHOULD validate that runtimes correctly enforce governance policies before granting deployment approval.
- Runtimes MUST apply governance policies after agent-level constraints and MUST NOT expose configuration options that disable governance enforcement.
- Audit logs SHOULD record when governance policies are applied, including the `policy_ref` and the resulting constraint modifications.
- Organizations SHOULD implement external monitoring that independently verifies governance policy application.
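One way to make the tighten-only property hold by construction is to apply governance as a set intersection, which can only narrow the agent-level constraint set. This is a hedged sketch: the field names (`allowed_tools`, `policy_ref`) are assumptions, not spec identifiers.

```python
import logging

logger = logging.getLogger("governance")

def apply_governance(agent_tools: set, policy_tools: set,
                     policy_ref: str) -> set:
    """Apply a governance tool allowlist on top of agent-level constraints.

    Intersection can only remove capabilities, never add them, so the
    tighten-only invariant holds regardless of the policy's contents.
    """
    effective = agent_tools & policy_tools
    # Audit record: which policy was applied and how the set changed.
    logger.info("applied %s: %d -> %d tools",
                policy_ref, len(agent_tools), len(effective))
    return effective
```

Encoding the invariant in the operation itself (rather than checking it after the fact) leaves no code path through which a policy could widen capabilities.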
Trust Boundaries
Agent Format defines three trust boundaries corresponding to the primary authoring roles:
| Boundary | Role | Trust Level |
|---|---|---|
| Agent definition | Agent Owner | Trusted to define agent capabilities and constraints |
| Runtime configuration | Runtime Owner | Trusted to map abstract definitions to concrete resources (models, endpoints, MCP servers) |
| Governance overlay | Governance Team | Trusted to define organization-wide policies that constrain agent behavior |
Key principles:
- The Agent Owner's definition MUST NOT be able to override governance policies. The `tighten_only_invariant` ensures governance can only narrow, never widen, agent capabilities.
- The Runtime Owner maps `server_ref` to concrete MCP server configurations, model names to provider endpoints, and `source` URIs to file paths. Runtimes MUST validate that these mappings do not grant capabilities beyond what the agent definition declares.
- Tool and skill invocations cross trust boundaries. Approval requirements gate these transitions. Runtimes MUST NOT bypass approval for tools or skills where the effective approval evaluates to required (i.e., `approval: true`, an `ApprovalConfig` object -- even if empty, or governance-imposed approval).
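The "effective approval" rule in the last bullet is easy to get wrong (an empty `ApprovalConfig` still means approval is required). A minimal sketch of the decision, assuming the approval field arrives as a parsed YAML value:

```python
def approval_required(approval_field, governance_requires: bool = False) -> bool:
    """Decide whether an invocation must be gated by approval.

    Approval is required when governance imposes it, when the field is the
    boolean true, or when it is an ApprovalConfig mapping -- even an empty one.
    """
    if governance_requires:
        return True
    if approval_field is True:
        return True
    if isinstance(approval_field, dict):  # ApprovalConfig object, even if {}
        return True
    return False
```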
YAML Parsing Attacks
Threat: YAML parsers are susceptible to several classes of attack that can cause denial of service or information disclosure:
- Billion Laughs (YAML Bomb): Deeply nested YAML anchors and aliases can expand exponentially, consuming unbounded memory. A small `.agf.yaml` file (< 1 KB) can expand to gigabytes when anchors are recursively dereferenced.
- Arbitrary object instantiation: Some YAML libraries (particularly in Python and Ruby) support language-specific tags (e.g., `!!python/object`) that can instantiate arbitrary objects, leading to remote code execution.
- Extremely large scalar values: A single string field containing megabytes of text can exhaust memory during parsing.
Mitigation:
- Parsers MUST disable language-specific YAML tags. Only the JSON-compatible YAML subset (strings, numbers, booleans, nulls, arrays, mappings) is valid in `.agf.yaml` files.
- Parsers MUST limit alias dereferencing depth and expansion size. A conforming parser SHOULD reject documents that expand beyond a configurable threshold (recommended default: 10 MB expanded size).
- Parsers SHOULD enforce maximum scalar value lengths (recommended default: 1 MB per scalar).
- Parsers SHOULD enforce maximum document nesting depth (recommended default: 64 levels).
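The expansion-size, scalar-length, and depth limits can be enforced with a post-parse walk, assuming the document was loaded with a tag-restricted loader (e.g., PyYAML's `yaml.safe_load`, which rejects language-specific tags such as `!!python/object`). Alias expansion limits ultimately need parser-level support; this sketch covers only the checks that can run on the parsed tree, using the recommended defaults above.

```python
MAX_EXPANDED_BYTES = 10 * 1024 * 1024  # recommended default: 10 MB expanded
MAX_SCALAR_BYTES = 1 * 1024 * 1024     # recommended default: 1 MB per scalar
MAX_DEPTH = 64                         # recommended default: 64 levels

def check_document(node, depth: int = 0, budget=None) -> None:
    """Walk a parsed YAML tree and enforce size, scalar, and depth limits."""
    if budget is None:
        budget = [0]  # running total of expanded string content
    if depth > MAX_DEPTH:
        raise ValueError("nesting too deep")
    if isinstance(node, str):
        if len(node) > MAX_SCALAR_BYTES:
            raise ValueError("scalar too large")
        budget[0] += len(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            check_document(key, depth + 1, budget)
            check_document(value, depth + 1, budget)
    elif isinstance(node, list):
        for item in node:
            check_document(item, depth + 1, budget)
    if budget[0] > MAX_EXPANDED_BYTES:
        raise ValueError("expanded document too large")
```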
Instruction Injection
Threat: The `instructions` field in the `agf.react` execution policy config is passed to the underlying LLM verbatim. If the instructions are generated or influenced by untrusted input (e.g., assembled from templates that include user-controlled data), an attacker can inject prompt directives that override the agent's intended behavior.
Additionally, `message_template` values in approval configurations could be crafted to mislead human reviewers -- for example, by including newlines or Unicode control characters that make an approval request appear benign when it is not.
Mitigation:
- Agent Owners SHOULD treat the `instructions` field as security-sensitive and MUST NOT construct it from untrusted input at authoring time.
- Runtimes SHOULD sanitize `message_template` output by stripping or escaping Unicode control characters (categories Cc and Cf, excluding common whitespace) before presenting it to human reviewers.
- Governance policies MAY augment `instructions` (e.g., prepending safety guidelines) but MUST NOT allow external input to flow into the augmentation.
- Runtimes SHOULD implement output guardrails that detect and flag responses inconsistent with the declared agent persona.
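The Cc/Cf stripping step maps directly onto Unicode general categories. A minimal sketch (the set of preserved whitespace characters is an assumption; pick what your UI needs):

```python
import unicodedata

# Whitespace to preserve so multi-line approval messages still render.
# This set is illustrative; \n is category Cc and would otherwise be stripped.
KEEP = {"\n", "\t"}

def sanitize_for_reviewer(text: str) -> str:
    """Strip Cc (control) and Cf (format) characters, e.g. the
    right-to-left override U+202E used to disguise malicious strings."""
    return "".join(
        ch for ch in text
        if ch in KEEP or unicodedata.category(ch) not in ("Cc", "Cf")
    )
```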
Cross-Agent Data Leakage
Threat: In multi-agent systems, data flows between agents via `input_mapping` and `output_from`. Without proper isolation:
- A sub-agent with a lower `data_classification` may receive confidential data from a parent agent via `input_mapping`.
- A sub-agent's output may inadvertently include sensitive data from its memory context when `memory_scope_strategy` is set to `inherit`.
- Remote agents (A2A) operate across organizational boundaries. Data sent to a remote agent leaves the Runtime Owner's control.
Mitigation:
- Runtimes SHOULD validate that data flowing via `input_mapping` does not cross `data_classification` boundaries without explicit governance approval. A parent agent with `data_classification: confidential` SHOULD NOT send data to a sub-agent classified as `public` unless a governance policy explicitly permits it.
- When `memory_scope_strategy` is `inherit`, the runtime SHOULD ensure that memory contents respect the child agent's data classification level.
- For `remote_agents`, runtimes SHOULD apply data classification checks before transmitting data over A2A. The `allowed_skills` allowlist limits the surface area of cross-boundary data flow.
- Organizations SHOULD use governance policies to enforce data flow rules based on `data_classification` and `namespace`.
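The first rule reduces to an ordering check over classification levels. In this sketch only `public` and `confidential` appear in the text above; the intermediate `internal` level and the numeric ordering are assumptions for illustration.

```python
# Hypothetical classification lattice; only "public" and "confidential"
# are taken from the spec text, "internal" is an assumed middle level.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

def may_flow(parent_classification: str, child_classification: str) -> bool:
    """Data may flow via input_mapping only to an equally or more
    restrictive sub-agent, absent an explicit governance exception."""
    return LEVELS[child_classification] >= LEVELS[parent_classification]
```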
Source Path Traversal
Threat: `local_agents[].source` with `source_type: file` accepts a path that resolves to a file. An attacker who controls the agent definition could reference paths outside the expected directory (e.g., `../../etc/secrets.yaml` or absolute paths to sensitive locations).
Mitigation:
- Runtimes MUST resolve `source` paths relative to a configured base directory and MUST reject paths that traverse above that directory (i.e., reject `..` segments after path normalization).
- Runtimes MUST NOT follow symbolic links that resolve outside the base directory.
- When `source_type` is `registry` or `db`, runtimes MUST validate the `source` value against the expected identifier format and MUST authenticate with the backing store before retrieval.
- Runtimes SHOULD log all source resolution attempts, including rejected paths, for audit purposes.
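The first two bullets can be handled together: resolving the candidate path normalizes `..` segments and follows symlinks, so a single containment check covers both traversal and symlink escape. A minimal sketch (the function name is illustrative; requires Python 3.9+ for `is_relative_to`):

```python
from pathlib import Path

def resolve_source(base_dir: str, source: str) -> Path:
    """Resolve a file source path, rejecting anything outside base_dir.

    Path.resolve() normalizes '..' segments and follows symlinks, so a
    symlink pointing outside the base directory also fails the check.
    """
    base = Path(base_dir).resolve()
    candidate = (base / source).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"source escapes base directory: {source}")
    return candidate
```

Note that checking the raw string for `..` is not sufficient: containment must be tested on the fully resolved path, after normalization and symlink resolution.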