Security Considerations

This page identifies security risks that implementers MUST consider when building runtimes, parsers, and tooling for Agent Format.

Condition Injection

Threat: args_match values in approval conditions or conditional routing may contain crafted patterns (e.g., complex regexes via the pattern matcher) that cause denial of service (ReDoS) or unintended matches.

Mitigation:

  • Runtimes MUST validate pattern matcher values against a safe regex subset or enforce execution time limits on pattern matching.
  • Runtimes SHOULD reject patterns that exhibit catastrophic backtracking characteristics.
  • Condition evaluation MUST be side-effect-free -- matching logic MUST NOT invoke external services or modify state.
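As a sketch of the first two mitigations, a runtime might pre-screen `args_match` patterns with a length cap plus a heuristic that rejects the nested-quantifier shapes most associated with catastrophic backtracking. The heuristic below is an illustrative assumption, not a complete safe-subset definition; a production runtime would pair it with a hard execution time limit on matching.

```python
import re

# Heuristic: a quantified group that itself contains a quantifier,
# e.g. (a+)+ or (\w*)*  -- classic catastrophic-backtracking shapes.
NESTED_QUANTIFIER = re.compile(r"\([^)]*[+*][^)]*\)[+*{]")

def is_safe_pattern(pattern: str, max_length: int = 256) -> bool:
    """Pre-screen a pattern-matcher value before it is ever executed."""
    if len(pattern) > max_length:
        return False
    if NESTED_QUANTIFIER.search(pattern):
        return False  # likely exponential backtracking
    try:
        re.compile(pattern)  # reject syntactically invalid patterns
    except re.error:
        return False
    return True
```

Note that this check is side-effect-free, consistent with the requirement that condition evaluation never invoke external services or modify state.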

Template Injection

Threat: message_template fields in approval configurations use {{variable}} placeholder syntax. If template variables are populated with untrusted user input, an attacker could inject template directives that expand to unintended content.

Mitigation:

  • Runtimes MUST use a strict allowlist of recognized template variables and MUST NOT evaluate arbitrary expressions within template placeholders.
  • Template expansion MUST NOT support nested expansion (a variable's value MUST NOT itself be treated as a template).
  • Runtimes SHOULD HTML-escape or context-escape expanded values when the approval message is rendered in a UI.
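A minimal expansion sketch that satisfies all three mitigations: the variable names in the allowlist are hypothetical, but the mechanism is the point. Each `{{variable}}` is substituted in a single pass, so a value containing `{{...}}` is inserted literally and never re-expanded.

```python
import html
import re

# Hypothetical allowlist; the actual recognized variable names are an assumption.
ALLOWED_VARS = {"tool_name", "agent_name", "args_summary"}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def expand_template(template: str, values: dict[str, str]) -> str:
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in ALLOWED_VARS:
            # Strict allowlist: no arbitrary expressions, no unknown names.
            raise ValueError(f"unrecognized template variable: {name}")
        # Escaped for UI rendering; the value is inserted literally and is
        # never re-scanned, so nested {{...}} in a value is not expanded.
        return html.escape(values.get(name, ""))
    return PLACEHOLDER.sub(substitute, template)
```

Because `re.sub` does not re-scan replacement text, the no-nested-expansion rule falls out of the implementation rather than needing a separate check.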

Governance Policy Bypass

Threat: The two-layer constraint model relies on runtimes faithfully enforcing tighten_only_invariant. A non-conforming or compromised runtime could ignore governance policies, weaken constraints, or skip approval requirements.

Mitigation:

  • The Governance Team SHOULD validate that runtimes correctly enforce governance policies before granting deployment approval.
  • Runtimes MUST apply governance policies after agent-level constraints and MUST NOT expose configuration options that disable governance enforcement.
  • Audit logs SHOULD record when governance policies are applied, including the policy_ref and the resulting constraint modifications.
  • Organizations SHOULD implement external monitoring that independently verifies governance policy application.
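To make the ordering and audit requirements concrete, here is a reduced sketch in which constraints are modeled as allowed tool sets; the audit record field names are assumptions, not part of the format. Intersecting with the agent-declared set means the governance layer can only narrow capabilities, which is the tighten_only_invariant in miniature.

```python
def apply_governance(agent_tools: set[str], policy_allowed: set[str],
                     policy_ref: str, audit_log: list[dict]) -> set[str]:
    """Apply a governance policy AFTER agent-level constraints.

    tighten_only_invariant: the effective set can only narrow, never
    widen, what the agent definition declares.
    """
    effective = agent_tools & policy_allowed
    # Audit the application, including the policy_ref and the
    # resulting constraint modifications.
    audit_log.append({
        "policy_ref": policy_ref,
        "removed": sorted(agent_tools - effective),
    })
    assert effective <= agent_tools  # invariant holds by construction
    return effective
```

External monitoring can then replay the audit log and independently verify that every recorded effective set is a subset of the corresponding agent declaration.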

Trust Boundaries

Agent Format defines three trust boundaries corresponding to the primary authoring roles:

Boundary              | Role            | Trust Level
----------------------|-----------------|------------
Agent definition      | Agent Owner     | Trusted to define agent capabilities and constraints
Runtime configuration | Runtime Owner   | Trusted to map abstract definitions to concrete resources (models, endpoints, MCP servers)
Governance overlay    | Governance Team | Trusted to define organization-wide policies that constrain agent behavior

Key principles:

  • The Agent Owner's definition MUST NOT be able to override governance policies. The tighten_only_invariant ensures governance can only narrow, never widen, agent capabilities.
  • The Runtime Owner maps server_ref to concrete MCP server configurations, model names to provider endpoints, and source URIs to file paths. Runtimes MUST validate that these mappings do not grant capabilities beyond what the agent definition declares.
  • Tool and skill invocations cross trust boundaries. Approval requirements gate these transitions. Runtimes MUST NOT bypass approval for tools or skills where the effective approval evaluates to required (i.e., approval: true; an ApprovalConfig object, even if empty; or governance-imposed approval).
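The "effective approval" rule from the last principle can be sketched as a small decision function. The parameter names are illustrative assumptions; the logic mirrors the three cases the text lists.

```python
def approval_required(agent_approval, governance_requires: bool) -> bool:
    """Does the effective approval for a tool/skill evaluate to required?

    agent_approval may be a bool, an ApprovalConfig-like dict, or None
    (unset) -- a modeling assumption for this sketch.
    """
    if governance_requires:
        return True          # governance-imposed approval
    if agent_approval is True:
        return True          # approval: true
    if isinstance(agent_approval, dict):
        return True          # ApprovalConfig object, even if empty
    return False
```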

YAML Parsing Attacks

Threat: YAML parsers are susceptible to several classes of attack that can cause denial of service or information disclosure:

  • Billion Laughs (YAML Bomb): Deeply nested YAML anchors and aliases can expand exponentially, consuming unbounded memory. A small .agf.yaml file (< 1 KB) can expand to gigabytes when anchors are recursively dereferenced.
  • Arbitrary object instantiation: Some YAML libraries (particularly in Python and Ruby) support language-specific tags (e.g., !!python/object) that can instantiate arbitrary objects, leading to remote code execution.
  • Extremely large scalar values: A single string field containing megabytes of text can exhaust memory during parsing.

Mitigation:

  • Parsers MUST disable language-specific YAML tags. Only the JSON-compatible YAML subset (strings, numbers, booleans, nulls, arrays, mappings) is valid in .agf.yaml files.
  • Parsers MUST limit alias dereferencing depth and expansion size. A conforming parser SHOULD reject documents that expand beyond a configurable threshold (recommended default: 10 MB expanded size).
  • Parsers SHOULD enforce maximum scalar value lengths (recommended default: 1 MB per scalar).
  • Parsers SHOULD enforce maximum document nesting depth (recommended default: 64 levels).
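The scalar-length and nesting-depth limits can be enforced with a post-parse walk over the object tree produced by a tag-restricted loader (e.g. PyYAML's `yaml.safe_load`, which rejects language-specific tags such as `!!python/object`). This sketch covers only those two limits; bounding alias expansion itself requires limits inside the parser, before the tree exists, and is not shown here.

```python
MAX_SCALAR = 1 * 1024 * 1024  # 1 MB per scalar (recommended default)
MAX_DEPTH = 64                # nesting depth (recommended default)

def validate_tree(node, depth: int = 0) -> None:
    """Reject oversized scalars and excessive nesting in a parsed document."""
    if depth > MAX_DEPTH:
        raise ValueError(f"document nesting exceeds {MAX_DEPTH} levels")
    if isinstance(node, str):
        if len(node) > MAX_SCALAR:
            raise ValueError("scalar value exceeds 1 MB")
    elif isinstance(node, dict):
        for key, value in node.items():
            validate_tree(key, depth + 1)
            validate_tree(value, depth + 1)
    elif isinstance(node, list):
        for item in node:
            validate_tree(item, depth + 1)
```

A simple pre-check on the raw file size (e.g. rejecting inputs over the 10 MB expansion threshold before parsing) is a cheap complementary defense against the Billion Laughs case.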

Instruction Injection

Threat: The instructions field in agf.react execution policy config is passed to the underlying LLM verbatim. If the instructions are generated or influenced by untrusted input (e.g., assembled from templates that include user-controlled data), an attacker can inject prompt directives that override the agent's intended behavior.

Additionally, message_template values in approval configurations could be crafted to mislead human reviewers -- for example, by including newlines or Unicode control characters that make an approval request appear benign when it is not.

Mitigation:

  • Agent Owners SHOULD treat the instructions field as security-sensitive and MUST NOT construct it from untrusted input at authoring time.
  • Runtimes SHOULD sanitize message_template output by stripping or escaping Unicode control characters (categories Cc and Cf, excluding common whitespace) before presenting to human reviewers.
  • Governance policies MAY augment instructions (e.g., prepending safety guidelines) but MUST NOT allow external input to flow into the augmentation.
  • Runtimes SHOULD implement output guardrails that detect and flag responses inconsistent with the declared agent persona.
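The control-character sanitization step maps directly onto Unicode general categories. A minimal sketch, with the set of preserved whitespace characters being an assumption about what "common whitespace" means:

```python
import unicodedata

SAFE_WHITESPACE = {"\n", "\t"}  # assumed "common whitespace" kept for layout

def sanitize_for_reviewer(text: str) -> str:
    """Strip Unicode control/format characters (categories Cc and Cf),
    keeping common whitespace, before showing an approval message."""
    return "".join(
        ch for ch in text
        if ch in SAFE_WHITESPACE or unicodedata.category(ch) not in ("Cc", "Cf")
    )
```

This removes, for example, the RIGHT-TO-LEFT OVERRIDE (U+202E, category Cf), which could otherwise visually reorder an approval message to hide its true content from a reviewer.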

Cross-Agent Data Leakage

Threat: In multi-agent systems, data flows between agents via input_mapping and output_from. Without proper isolation:

  • A sub-agent with lower data_classification may receive confidential data from a parent agent via input_mapping.
  • A sub-agent's output may inadvertently include sensitive data from its memory context when memory_scope_strategy is set to inherit.
  • Remote agents (A2A) operate across organizational boundaries. Data sent to a remote agent leaves the Runtime Owner's control.

Mitigation:

  • Runtimes SHOULD validate that data flowing via input_mapping does not cross data_classification boundaries without explicit governance approval. A parent agent with data_classification: confidential SHOULD NOT send data to a sub-agent classified as public unless a governance policy explicitly permits it.
  • When memory_scope_strategy is inherit, the runtime SHOULD ensure that memory contents respect the child agent's data classification level.
  • For remote_agents, runtimes SHOULD apply data classification checks before transmitting data over A2A. The allowed_skills allowlist limits the surface area of cross-boundary data flow.
  • Organizations SHOULD use governance policies to enforce data flow rules based on data_classification and namespace.
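A runtime might encode the first mitigation as an ordered comparison of classification levels. The level names beyond `public` and `confidential`, and the numeric ordering itself, are assumptions for this sketch:

```python
# Hypothetical sensitivity ordering; intermediate level names are assumptions.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

def flow_allowed(parent_class: str, child_class: str,
                 governance_exception: bool = False) -> bool:
    """May input_mapping send data from parent to a sub-agent?

    Data may flow toward equal or higher sensitivity freely; flowing
    "down" (e.g. confidential -> public) requires an explicit
    governance policy exception.
    """
    if LEVELS[parent_class] <= LEVELS[child_class]:
        return True
    return governance_exception
```

The same check applies before A2A transmission to remote_agents, with the `allowed_skills` allowlist further limiting which skills can carry data across the boundary.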

Source Path Traversal

Threat: local_agents[].source with source_type: file accepts a path that resolves to a file. An attacker who controls the agent definition could reference paths outside the expected directory (e.g., ../../etc/secrets.yaml or absolute paths to sensitive locations).

Mitigation:

  • Runtimes MUST resolve source paths relative to a configured base directory and MUST reject paths that traverse above that directory (i.e., reject .. segments after path normalization).
  • Runtimes MUST NOT follow symbolic links that resolve outside the base directory.
  • When source_type is registry or db, runtimes MUST validate the source value against the expected identifier format and MUST authenticate with the backing store before retrieval.
  • Runtimes SHOULD log all source resolution attempts, including rejected paths, for audit purposes.
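For the file case, the first two mitigations reduce to resolving the candidate path (which normalizes `..` segments and follows symlinks) and confirming it is still under the resolved base directory. A sketch:

```python
from pathlib import Path

def resolve_source(base_dir: str, source: str) -> Path:
    """Resolve a source_type: file path, rejecting traversal and
    symlinks that escape the configured base directory."""
    base = Path(base_dir).resolve()
    # resolve() normalizes ".." segments AND follows symbolic links,
    # so a link pointing outside the base directory is also caught.
    candidate = (base / source).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"source path escapes base directory: {source}")
    return candidate
```

Because joining an absolute path (e.g. `/etc/passwd`) onto `base` simply yields that absolute path, the containment check rejects absolute paths to sensitive locations as well as `..` traversal.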