# Mental Model
The core insight of A2UI: agents express what they want the UI to do; the UI decides how to render it.
## The Problem with Traditional Approaches
### Raw Text Parsing
Agent outputs unstructured text. Client must parse, guess intent, and hope formatting is consistent. Brittle and error-prone.
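To make the failure mode concrete, here is a sketch of the kind of parsing a client ends up writing; the regex and field names are illustrative, not taken from any real client:

```typescript
// Illustrative only: a client trying to recover structure from free-form text.
// Nothing guarantees the agent phrases its next reply the same way.
function parseWeatherReply(text: string): { temperature?: string } {
  const match = text.match(/Temperature:\s*(\S+)/);
  return { temperature: match?.[1] };
}

parseWeatherReply("Temperature: 72°F");        // { temperature: "72°F" }
parseWeatherReply("It's a sunny 72°F today."); // { temperature: undefined }, silent failure
```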
### Rigid JSON
Agent outputs custom JSON schemas per use case. No interoperability between different agents or UIs. Every integration is bespoke.
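For example (both payloads are invented), two agents might describe the same weather data with incompatible shapes, so a UI built against one schema cannot render the other:

```typescript
// Illustrative only: two bespoke schemas for the same data.
// A UI written against agentA's shape cannot render agentB's, and vice versa.
const agentA = { widget: "weather-card", temp_f: 72, sky: "sunny" };
const agentB = {
  component: { kind: "WeatherPanel", temperature: "72°F", conditions: "Sunny" },
};
```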
## The A2UI Approach
### Standardized Intents
Agent emits UI intents from a known vocabulary. Any conforming renderer can display them while maintaining visual flexibility. The protocol defines semantics, not presentation.
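As a sketch, such a vocabulary can be modeled as a discriminated union. The `Card` shape mirrors the example later in this document; `Confirm` and `TextInput` are hypothetical primitives, not the actual A2UI schema:

```typescript
// A sketch of a standardized intent vocabulary as a discriminated union.
// "Card" mirrors the example below; "Confirm" and "TextInput" are hypothetical.
type Intent =
  | { type: "Card"; title: string; fields: { label: string; value: string }[] }
  | { type: "Confirm"; prompt: string }
  | { type: "TextInput"; label: string };
```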
### Separation of Concerns
| Layer | Responsibility | Owner |
|---|---|---|
| Reasoning | Decide what to show/ask | Agent |
| Intent | Express UI needs structurally | A2UI Protocol |
| Rendering | Visual presentation | UI Framework |
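In code, the boundary might look like the sketch below, which assumes the `Intent` type from the previous section; neither signature is prescribed by the protocol itself:

```typescript
// Agent layer: decides *what* to show; never touches presentation.
async function decideNextStep(userMessage: string): Promise<Intent> {
  // LLM call elided; the model is prompted to emit a structured intent.
  return {
    type: "Card",
    title: "Current Weather",
    fields: [{ label: "Temperature", value: "72°F" }],
  };
}

// Rendering layer: decides *how* to show it; never reasons about the task.
interface Renderer {
  render(intent: Intent): void;
}
```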
## Example Flow
1. Agent reasoning (internal): "User asked about the weather. I should show current conditions."
2. Agent emits an A2UI intent:
   ```json
   {
     "type": "Card",
     "title": "Current Weather",
     "fields": [
       { "label": "Temperature", "value": "72°F" },
       { "label": "Conditions", "value": "Sunny" }
     ]
   }
   ```

3. UI renders appropriately.
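To illustrate "renders appropriately": a plain-text renderer might map the Card intent above to a few lines of output, while a web renderer would map the same intent to its own components. A sketch (the function name is illustrative):

```typescript
// A minimal plain-text renderer for the Card intent. A web or mobile renderer
// would receive the same intent and map it to its own native components;
// the protocol does not prescribe the output.
function renderCardAsText(card: {
  title: string;
  fields: { label: string; value: string }[];
}): string {
  const lines = [`== ${card.title} ==`];
  for (const f of card.fields) lines.push(`${f.label}: ${f.value}`);
  return lines.join("\n");
}

// Output for the example above:
// == Current Weather ==
// Temperature: 72°F
// Conditions: Sunny
```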
## Key Benefits
- Model-agnostic: works with any LLM that can emit structured output.
- Renderer-independent: the same intent works across web, mobile, CLI, and voice.
- Explainable: intents are inspectable and debuggable.
- Evolvable: new primitives can be added without breaking existing ones (see the sketch below).
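One way evolvability shows up in practice: a renderer can treat unrecognized intent types as a fallback case rather than an error, so older UIs degrade gracefully when agents start emitting newer primitives. A sketch, reusing the illustrative `Intent` type and `renderCardAsText` function from above:

```typescript
// Graceful degradation: a renderer built before a newer primitive existed
// falls back to an inspectable representation instead of crashing.
function render(intent: Intent): string {
  switch (intent.type) {
    case "Card":
      return renderCardAsText(intent);
    default:
      // Unrecognized or newer primitive: degrade, don't break.
      return JSON.stringify(intent, null, 2);
  }
}
```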