aiio LeitWerk - Generative UI Hackathon with Google and CopilotKit
Hackathon Showcase Finalist


aiio GmbH is prototyping an AG-UI/A2UI control surface to help humans steer agents through complex, structured enterprise process ontologies.


aiio LeitWerk’s UI adapts itself to the question being asked.

The same backend data renders as a vertical agenda when a user needs to act, a 6×5 heatmap when they need to see, or a deck of three decision cards when they need to choose. The agent decides at runtime which shape fits, based on the intent expressed in how the question is framed rather than on who asked. Identity is an ordinary session attribute. Intent is what the LLM classifies. Intent selects the shape, identity scopes the content, and the combination determines the UI.

This is what we think “agentic” should actually look like when it’s more than chat with extra steps.

How the routing actually works

Two inputs, one decision:

  • Identity comes from the session — worker ID, role, department, location. Same way every real app does it. The LLM doesn’t guess this.
  • Intent comes from the message. “What’s on my plate today?” is asking-to-act. “Where are we at risk?” is asking-to-see. “How should we handle this?” is asking-to-choose. The LLM classifies the intent.

Together they determine the UI. A supervisor asking “how should we handle this week” gets a decision deck scoped to her authority. A VP asking the same question gets a network-wide one. The agent doesn’t generate three UIs because there are three personas; it generates the UI that fits the question this person is asking right now. We built two routing branches during the six-hour build: one where identity is set explicitly via a dropdown, and one where the LLM analyzes the question’s framing and routes accordingly. The demo uses the dropdown version for clarity.
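A minimal sketch of the two-input routing described above, for illustration only and not the project's code. The keyword-based `classify_intent` is a stand-in for the LLM classification call, and the names `Session`, `LENS_FOR_INTENT`, `SCOPE_FOR_ROLE`, and the scope labels are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    ACT = "act"        # "What's on my plate today?"
    SEE = "see"        # "Where are we at risk?"
    CHOOSE = "choose"  # "How should we handle this?"


# Intent picks the shape of the UI.
LENS_FOR_INTENT = {
    Intent.ACT: "agenda",
    Intent.SEE: "heatmap",
    Intent.CHOOSE: "decision_deck",
}

# Identity picks the scope of the data (role names and scopes are hypothetical).
SCOPE_FOR_ROLE = {"supervisor": "own_line", "quality_manager": "site", "vp": "network"}


@dataclass(frozen=True)
class Session:
    worker_id: str
    role: str


def classify_intent(message: str) -> Intent:
    """Keyword stand-in for the LLM classification call."""
    text = message.lower()
    if "risk" in text or "where" in text:
        return Intent.SEE
    if "how should" in text or "handle" in text:
        return Intent.CHOOSE
    return Intent.ACT


def route(session: Session, message: str) -> dict:
    """Combine session identity and message intent into one UI decision."""
    intent = classify_intent(message)
    return {
        "lens": LENS_FOR_INTENT[intent],
        "scope": SCOPE_FOR_ROLE.get(session.role, "own_line"),
        "worker_id": session.worker_id,
    }
```

With this sketch, a supervisor and a VP asking the same "how should we handle this week" question both get the `decision_deck` lens, but with different scopes.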

Three reasons this isn’t a chatbot in a trenchcoat

  1. The interface adapts to intent, not just content. A chatbot has one modality. A dashboard has many, but you have to navigate to them. aiio LeitWerk’s first decision per request is which shape of artifact this person needs right now: agenda, heatmap, decision deck, or something else. The LLM routes from natural language to UI shape and streams typed component props to the frontend.
  2. The agent reasons across heterogeneous live data: a real domain knowledge graph (running in FalkorDB) plus three JSON snapshots simulating Workday HR data, SAP maintenance records, and ETQ-style quality data. The agent joins them at runtime to surface conclusions no single source would expose, for example that the internal auditor scheduled to lead next week’s cycle has a certificate that expired five days ago. That insight needs the graph (Quality is responsible for internal audits) and the live data (this specific worker’s cert is overdue). Neither alone gets there.
  3. Strategies are synthesized, not selected from a menu. When the VP asks how to handle the week, the agent composes three coherent plays (Audit-First, Compliance Surge, Risk-Triage), each with situation-specific pros, cons, and a headline metric, all reasoned over the live state. A traditional system shows fixed options. An LLM can synthesize them with tradeoffs that account for this week’s specifics.
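The cross-source join in the auditor example can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `RESPONSIBLE` mapping stands in for a responsibility edge in the knowledge graph, and the record fields, worker IDs, and dates are made up.

```python
from datetime import date

# Graph side: which department is responsible for which process
# (stand-in for a responsibility edge in the knowledge graph).
RESPONSIBLE = {"internal_audit": "Quality"}

# HR side: illustrative records; field names and values are assumptions.
HR_RECORDS = [
    {"worker_id": "w-17", "department": "Quality", "role": "internal_auditor",
     "cert_expires": date(2025, 3, 5)},
    {"worker_id": "w-22", "department": "Quality", "role": "internal_auditor",
     "cert_expires": date(2025, 9, 1)},
]


def lapsed_auditors(process: str, today: date) -> list[dict]:
    """Workers in the responsible department whose certificate has expired."""
    department = RESPONSIBLE[process]  # needs the graph
    findings = []
    for record in HR_RECORDS:          # needs the operational data
        if record["department"] != department:
            continue
        days_overdue = (today - record["cert_expires"]).days
        if days_overdue > 0:
            findings.append(
                {"worker_id": record["worker_id"], "days_overdue": days_overdue}
            )
    return findings
```

Neither input answers the question on its own: the mapping says who should be checked, and the records say whose certificate has lapsed.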

The demo we built

A manufacturing site is hosting a customer audit on Thursday. The week is full of operational landmines — certifications expiring mid-week, an internal auditor whose qualification just lapsed, a maintenance window that closes Wednesday. Three different roles ask the agent the question that fits their job. The production supervisor asks what’s on her plate today and gets her agenda. The quality manager asks where the risk is and gets the heatmap. The VP asks how to handle the week and gets three strategy cards. She picks Audit-First. The agent registers the decision as an active policy in typed server-side state — Applying…

In a production deployment, that policy state drives the autonomous execution layer — calendar blocks, drafted communications, scheduled follow-ups for the rest of the week, propagated changes into the supervisor’s view. In our six-hour build, we shipped the decision routing and the apply intent. The execution layer is the next step.
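One way to picture "registering the decision as an active policy in typed server-side state" is the sketch below. The `StrategyCard` and `PolicyState` classes and their fields are hypothetical, chosen to mirror the card contents described earlier (pros, cons, headline metric). It uses stdlib dataclasses rather than the Pydantic models the backend relies on.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class StrategyCard:
    name: str
    pros: tuple[str, ...]
    cons: tuple[str, ...]
    headline_metric: str


@dataclass
class PolicyState:
    """Server-side state: the deck on offer and the policy currently in force."""
    deck: dict[str, StrategyCard] = field(default_factory=dict)
    active: Optional[str] = None

    def offer(self, cards: list[StrategyCard]) -> None:
        # A freshly synthesized deck replaces the old one and clears the choice.
        self.deck = {card.name: card for card in cards}
        self.active = None

    def apply(self, name: str) -> StrategyCard:
        # Only a card from the current deck can become the active policy.
        if name not in self.deck:
            raise KeyError(f"unknown strategy: {name}")
        self.active = name
        return self.deck[name]
```

An execution layer could then read `PolicyState.active` to decide what to schedule. That part is outside this sketch.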

We picked manufacturing compliance as the domain because it’s a place where generic assistants fail catastrophically — the relevant data lives in four different systems, the personas have wildly different needs, and the cost of being wrong is measured in customer contracts. But the pattern generalizes: anywhere a decision-maker has to act on heterogeneous live state and have consequences ripple through a team, this shape applies.

How we built it

Backend. Python · FastAPI · Uvicorn · Pydantic for typed request/response. Anthropic Claude for the LLM calls, orchestrated through LangChain. The agent does intent classification, lens routing, and strategy synthesis as live model calls — type a different question, click a different option, get fresh output.

Data layer. Knowledge graph hosted in FalkorDB via Docker Compose with persistence; queried with Cypher over the Redis protocol and warmed into in-memory structures for fast lens rendering. Three JSON snapshots (hr.json, qms.json, cmms.json) carry the operational records, each linked back to KG IDs, and the agent joins them at runtime to compose each lens’s props.
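As a rough illustration of "linked back to KG IDs", the sketch below groups records from the three snapshots under the graph node they reference. The inline JSON strings stand in for hr.json, qms.json, and cmms.json. The `kg_id` field name and all record contents are assumptions, not the real file schema.

```python
import json
from collections import defaultdict

# Inline stand-ins for the three snapshot files; contents are hypothetical.
SNAPSHOTS = {
    "hr": '[{"kg_id": "dept:quality", "worker_id": "w-17"}]',
    "qms": '[{"kg_id": "dept:quality", "open_findings": 2}]',
    "cmms": '[{"kg_id": "dept:maintenance", "work_order": "wo-9"}]',
}


def build_index(snapshots: dict[str, str]) -> dict[str, dict[str, list[dict]]]:
    """Group records from every source under the KG node they link to."""
    index: dict[str, dict[str, list[dict]]] = defaultdict(lambda: defaultdict(list))
    for source, raw in snapshots.items():
        for record in json.loads(raw):
            index[record["kg_id"]][source].append(record)
    return index


# Warm the index once; lens rendering can then look up a node in memory.
INDEX = build_index(SNAPSHOTS)
```

After warming, `INDEX["dept:quality"]` holds both the HR and the QMS records for that node, which is the starting point for a cross-source join.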

Frontend. React · TypeScript · A2UI for the agent-to-UI streaming protocol. Three lens components take typed props; the agent decides at runtime which to render and streams the props in.
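The idea of "typed props streamed in" can be sketched on the Python side as below. This is not the A2UI wire format: the event shapes (`lens`, `props`, `done`) and the `AgendaProps` fields are invented for illustration, and a real integration would follow the protocol's own message schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator


@dataclass(frozen=True)
class AgendaItem:
    time: str
    title: str


@dataclass(frozen=True)
class AgendaProps:
    worker_id: str
    items: tuple[AgendaItem, ...]


def stream_lens(lens: str, props: AgendaProps) -> Iterator[str]:
    """Yield JSON events: the lens choice, then props piece by piece, then done."""
    yield json.dumps({"type": "lens", "name": lens})
    payload = asdict(props)  # nested dataclasses become plain dicts
    yield json.dumps({"type": "props", "key": "worker_id", "value": payload["worker_id"]})
    for item in payload["items"]:
        # Each agenda item is sent separately so the UI can render incrementally.
        yield json.dumps({"type": "props", "key": "items", "append": item})
    yield json.dumps({"type": "done"})
```

Sending the lens name first lets the frontend mount the right component before any props arrive.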

What’s real, what’s stubbed

Real: intent classification, lens routing, strategy synthesis, the FalkorDB-backed knowledge graph, the cross-source join with HR/QMS/CMMS data, the typed prop streaming over A2UI. Type a different question, get a different lens.

Stubbed: real Workday and SAP connectors (the JSON snapshots stand in), email send (drafts only), real time (the demo clock is fixed), and the autonomous execution layer that would fire after the apply moment. The seams are typed; swapping fixtures for live systems is a one-class change, not a rewrite.
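A typed seam of the kind described here might look like the following sketch. `HrSource`, `FixtureHrSource`, and `count_certifications` are hypothetical names. The point is only that callers depend on the protocol, so a live connector class could replace the fixture class without touching them.

```python
from typing import Protocol


class HrSource(Protocol):
    """The seam: anything that can list a worker's certification records."""

    def certifications(self, worker_id: str) -> list[dict]: ...


class FixtureHrSource:
    """Serves records from an in-memory snapshot (what hr.json would hold)."""

    def __init__(self, records: list[dict]) -> None:
        self._records = records

    def certifications(self, worker_id: str) -> list[dict]:
        return [r for r in self._records if r["worker_id"] == worker_id]


def count_certifications(source: HrSource, worker_id: str) -> int:
    # Depends only on the protocol, so a live connector can replace the fixture.
    return len(source.certifications(worker_id))
```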

One sentence to remember: The UI adapts to the question. That’s the dashboard, killed.

aiio’s ongoing research focuses on organizational intelligence — how operations encode their own knowledge in process ontologies and domain knowledge graphs, and how agentic systems can reason and act on top of them. The team came to the hackathon with three pre-existing artifacts: a synthetic knowledge graph (25 processes, 13 departments, RACI relationships, regulation edges) drawn from our existing demo work, three synthetic operational-data fixtures shaped like typical Workday / ETQ / SAP PM exports, and a written technical specification. These are design and data assets — no executing code. The agent loop, the three lens components, the AG-UI streaming integration, the autonomous follow-up scheduler, the watch engine, and the cross-persona propagation were all built during the six-hour window.

Anthropic Google DeepMind LangChain

