Reliable UIs Even With Language Models

Bounded interfaces and ahead-of-time generation make LLM-driven UIs reliable and repeatable.

Tomás González

Google recently released A2UI, a protocol for agent-driven interfaces. The Hacker News discussion had a lot of questions: Does the model run on every click? What happens when it hallucinates? How do you test this?

These are the right questions. The answers depend on choices and trade-offs that most discussions skip over.

Two decisions that matter

Any system using LLMs to drive UI sits somewhere on two axes:

How constrained is the interface? Can the model output arbitrary HTML, or only select from a fixed set of components backed by a typed API?

When does the model run? On every interaction (just-in-time), or once upfront to generate logic that executes later (ahead-of-time)?

Interfaces are the bottleneck

In theory, a sufficiently capable model could operate over any interface: raw HTML, an unbounded toolset, whatever you throw at it.

In practice, we’re not there yet. Open-ended interfaces explode the space of possible behaviors, making systems harder to test, debug, and trust.

The fix: constrain what’s possible, not how it’s decided.

Constrained presentation

A2UI gets this right on the presentation layer. Agents send declarative component descriptions that clients render using native widgets. The model selects from a fixed catalog of pre-approved components. No arbitrary HTML injection.

We do the same thing. Our controllers return structured data that maps to a fixed set of UI components. The model can’t render whatever it wants.
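To make the constraint concrete, here is a minimal Python sketch of a fixed component catalog with validation. The component names and fields are hypothetical, not A2UI's or Cased's actual schema:

```python
# Hypothetical catalog: the only component types and fields the client
# will ever render. Anything outside this set is rejected outright.
ALLOWED_COMPONENTS = {
    "table": {"columns", "rows"},
    "button": {"label", "action"},
    "text": {"content"},
}

def validate_component(spec: dict) -> bool:
    """Accept a component only if its type and fields are in the catalog."""
    kind = spec.get("type")
    if kind not in ALLOWED_COMPONENTS:
        return False
    fields = set(spec) - {"type"}
    return fields <= ALLOWED_COMPONENTS[kind]
```

A `{"type": "button", "label": "Retry", "action": "retry"}` passes; an `iframe`, or a `text` component smuggling an `onclick` field, does not. The model decides *which* components to emit, never *what kinds* of components exist.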

At Cased, where we focus on infrastructure tooling, repeatability and reliability are critical, so this constraint matters a great deal.

Constrained runtime

We go further. Beyond presentation, we also constrain the runtime: the backend API the generated code can call.

Instead of giving the model access to unbounded tools, we expose a deterministic, capability-scoped API. The model generates glue logic that connects our UI schema to this constrained runtime. The how is up to the model. The where and what are fixed.
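As a sketch of capability scoping (method names and the `ScopedRuntime` wrapper are illustrative, not our actual API), the generated glue code can only invoke an explicit whitelist:

```python
class ScopedRuntime:
    """Generated code may call only whitelisted capabilities; all else is refused."""
    CAPABILITIES = {"list_deploys", "get_deploy"}

    def __init__(self, backend):
        self._backend = backend

    def call(self, name, **kwargs):
        # The model-generated glue decides *when* to call and with *which*
        # arguments; it cannot reach any method outside CAPABILITIES.
        if name not in self.CAPABILITIES:
            raise PermissionError(f"capability not granted: {name}")
        return getattr(self._backend, name)(**kwargs)

class FakeBackend:
    """Stand-in backend for the sketch."""
    def list_deploys(self):
        return [{"id": "d1", "status": "ok"}]
    def get_deploy(self, deploy_id):
        return {"id": deploy_id, "status": "ok"}

runtime = ScopedRuntime(FakeBackend())
```

Even if the model hallucinates a call like `delete_everything`, the runtime raises `PermissionError` instead of executing it. That bounds the blast radius of any bad generation.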

Just-in-time vs ahead-of-time

A2UI keeps the model in the loop: agents reason at runtime, emit UI descriptions, and clients render them as they stream in. Each user action can trigger a new inference cycle.

The problem is latency. Click a button, wait for inference. Filter a table, wait for inference. The UI feels sluggish because every interaction round-trips through a model. You’ve built a responsive frontend on top of an inherently slow backend.
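The just-in-time shape, reduced to a sketch (the handler and the stub model are hypothetical; a real `infer` call adds network and inference latency on every interaction):

```python
class EchoModel:
    """Stand-in for an LLM endpoint; a real one is slow per call."""
    def infer(self, state, event):
        # Re-reasons about the whole UI from scratch on each interaction.
        return {"components": [{"type": "text", "content": f"{event} handled"}]}

def handle_interaction(model, state, event):
    # Just-in-time: every click, filter, or sort round-trips through
    # the model before the client can render anything new.
    return model.infer(state, event)
```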

We do it differently. The model runs once, ahead of time, generating a controller. That controller then executes deterministically, no model in the loop. Filtering, sorting, drilling down: all instant, because you’re just running code.
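For contrast, here is what an ahead-of-time generated controller might look like once the model has written it (a hypothetical example; the function name and fields are illustrative). At runtime it is plain, deterministic code:

```python
def deploys_controller(rows, status=None, sort_key="started_at", descending=True):
    """Model-generated once; thereafter filtering and sorting is just code."""
    if status is not None:
        rows = [r for r in rows if r["status"] == status]
    return sorted(rows, key=lambda r: r[sort_key], reverse=descending)
```

Filtering a thousand-row table here costs microseconds, not an inference round trip, and the same inputs always produce the same output, which is what makes the view testable and shareable.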

This is essentially the compile-versus-interpret distinction. By moving reasoning out of the hot path, you get responsiveness, replayability, and predictable behavior.

The tradeoff

Just-in-time reasoning is more flexible. The UI can adapt dynamically to complex agent workflows.

Ahead-of-time generation is more predictable. Once a view exists, it behaves the same way every time. You can test it, share it, debug it.

Neither is universally right. They’re different points in the design space.

Ahead-of-time generation fits when:

  • The UI needs to feel interactive, not sluggish
  • The same view will be used and shared repeatedly
  • You need to test or audit the exact behavior

Just-in-time reasoning fits when:

  • The UI adapts to unpredictable agent state
  • Interactions are conversational or exploratory
  • Latency is acceptable for the task
  • Flexibility matters more than repeatability

Most operational and infra tools fall into the first category. Most agent-driven chat experiences fall into the second. The mistake is treating them as the same problem.

Why clarity matters

“AI-generated UI” could mean an LLM writing HTML on every request, a protocol streaming component descriptions, or a system that generates code once and runs it forever. These have radically different reliability and cost profiles.

We don’t yet have a shared vocabulary for these tradeoffs, but hopefully A2UI and our work at Cased get the conversation started.

We think constrained interfaces, both presentation and runtime, plus ahead-of-time generation is the practical choice for operational tools today. The models will improve. The interfaces can expand later. For now, constraints are a feature.