Why a Standard?

The Problem

The AI agent infrastructure landscape moves fast. Frameworks rise and fall, model providers change APIs, orchestration patterns evolve. Organizations building agents face a common challenge: every new agent means rebuilding the same components — tool wiring, retry logic, token budgets, output parsing. And every platform upgrade — a new model provider, a different tool protocol, a runtime migration — means touching every agent's code. Teams move fast, but they don't scale.

The pattern is familiar. The same thing happened with infrastructure before containers, with deployments before CI/CD, with configuration before Infrastructure-as-Code. The agent ecosystem is in its "imperative scripts" era: powerful but fragmented, with every team building bespoke solutions that can't be shared, composed, or governed uniformly.

The intuitive solution is clear: let agent developers describe what their agent is, and let the platform figure out how to run it. A clean contract between agent owners and platform owners — an Interface Definition Language for AI agents — so that agent definitions never have to change just because the infrastructure underneath them changed.

This declarative layer needs to serve two purposes simultaneously:

  1. Documentation — a human-readable description of what a complex agent system does, so anyone (developers, security teams, auditors) can understand it at a glance.
  2. Configuration — a machine-readable specification where the agent's behavior changes when the config changes, without touching code.

Why Not Existing Solutions?

Many commercial and open-source agent platforms offer declarative layers for defining agents. The most common approach is the graph model — nodes and edges — which is powerful for expressing complex execution flows with branching, merging, and cycles. However, graph-based definitions can be hard to understand without a visualization tool, especially as agent complexity grows. More fundamentally, graphs emphasize the execution flow — how steps connect — rather than the agent itself.

Agent Format takes a different approach: a format that a developer can read in a text editor, a security team can audit as a file, and a platform can execute without ambiguity.

Back to First Principles

The starting question is: how do you model an AI agent?

This question has been studied for decades in the field of decision-making under uncertainty. The established answer is the POMDP (Partially Observable Markov Decision Process) — a mathematical framework for agents that observe partial information, maintain beliefs about their state, take actions, and receive feedback.

A POMDP-based agent has a well-defined structure:

  • State and observations — what the agent knows (its memory, context, inputs)
  • Action space — what the agent can do (tools, sub-agents, MCP servers)
  • Policy — how the agent decides what to do next (the execution strategy)
  • Constraints — what limits the agent operates within (budgets, approvals, governance)

This isn't just theory. Every common agent pattern — ReAct loops, sequential pipelines, parallel tool calls — can be understood as a specialization of this model, whether the framework implementing it names it or not.

Agent Format makes this structure explicit and declarative. Instead of burying the agent's architecture in imperative code, we surface it as a readable YAML document where each POMDP concern maps to a schema section:

  Concern                   Agent Format Section
  Identity and interface    metadata + interface
  State and memory          memory
  Action space              action_space (tools, sub-agents, MCP servers)
  Policy                    execution_policy (ReAct, sequential, parallel, etc.)
  Constraints and rewards   constraints + approval + governance_policies
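The mapping above can be sketched as a skeleton .agf.yaml. The top-level section names come from the table; everything else (the nesting, the field names inside each section, and all values) is illustrative, not the canonical Agent Format schema:

```yaml
# Hypothetical sketch only — top-level sections follow the mapping table;
# inner fields and values are assumptions for illustration.
metadata:
  name: support-triage-agent
  version: "0.1.0"
interface:
  input: customer_ticket        # what the agent accepts
  output: triage_decision      # what it produces
memory:
  type: conversation            # state and observations
action_space:
  tools:
    - name: search_kb           # a tool the agent may call
  sub_agents: []
  mcp_servers: []
execution_policy:
  strategy: react               # the decision policy
constraints:
  max_tokens_per_run: 50000     # budget limit
approval:
  required_for: [external_write]
governance_policies: []
```

Read top to bottom, the file answers the four POMDP questions in order: what the agent knows, what it can do, how it decides, and what limits it operates within.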

What Agent Format Solves

The result is a single declarative format that addresses the core problems:

  1. Separates definition from execution — the YAML declares WHAT the agent is, the runtime decides HOW to run it. Platform teams upgrade infrastructure without breaking agents. Agent owners modify behavior without understanding infrastructure.
  2. Is portable across runtimes — the same .agf.yaml runs on any compliant SDK. No vendor lock-in. Switch execution engines with zero code changes.
  3. Is human-readable — unlike graph-based definitions, a YAML file can be read in a text editor, reviewed in a pull request, diffed in version control, and explained to an auditor.
  4. Enables governance at scale — auditors can read YAML instead of source code. Policies compose automatically. A declarative format enables static analysis of agent capabilities before deployment — you can verify invariants like "no agent with financial write access can use public internet tools" without running anything.
  5. Supports graduated complexity — a hobby project uses 10 lines; an enterprise deployment uses 100. The same format, the same tools, no rewrite required.
  6. Is schema-validated — every .agf.yaml is validated against a JSON Schema. Misconfigurations are caught at authoring time, not at 3 AM in production.
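Point 5 can be made concrete with a hypothetical minimal definition — roughly ten lines — that uses the same sections as an enterprise deployment, just far fewer of them (again, inner field names are assumptions, not the canonical schema):

```yaml
# Hypothetical minimal .agf.yaml — a hobby-scale agent in the same format.
metadata:
  name: weather-helper
interface:
  input: question
  output: answer
action_space:
  tools:
    - name: get_forecast   # single illustrative tool
execution_policy:
  strategy: react
```

Growing this into the enterprise case means adding sections (memory, constraints, approval, governance_policies), not rewriting the file or switching formats.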

To understand how these principles manifest in the format design, see the six-role ecosystem that Agent Format serves. For details on how Agent Format integrates with MCP, A2A, and existing frameworks, see Integrations & Ecosystem.