Three loops, three timescales
Most working robot brains are a hierarchy of loops whose rates differ by three to four orders of magnitude:
- Reflex tier — 1 kHz. Torque control, balance, contact response. Pure control theory; no learning surprises allowed here. If this tier misses deadlines, the robot is on the floor before the next tier notices.
- Skill tier — 10–100 Hz. Learned policies: walk, grasp, place, open-door. Today these are usually neural networks trained in simulation that output joint targets for the reflex tier to track.
- Deliberation tier — 0.1–1 Hz. Task planning: decompose "clear the workbench" into a skill sequence, monitor progress, recover from failure. This is where language models entered robotics.
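To make the rate gap concrete, here is a minimal sketch that counts how often each tier would run over two simulated seconds. It assumes a 1 ms base tick and picks 50 Hz and 1 Hz as example values inside the bands above; the `Tier` class and `simulate` function are illustrative names, not part of any real robot stack, and a real reflex tier would run on a real-time scheduler rather than a Python loop.

```python
from dataclasses import dataclass

BASE_HZ = 1000  # reflex tier rate; it doubles as the scheduler tick (1 ms)


@dataclass
class Tier:
    name: str
    rate_hz: float
    runs: int = 0

    @property
    def period_ticks(self) -> int:
        # Number of 1 ms base ticks between two runs of this tier.
        return round(BASE_HZ / self.rate_hz)


def simulate(tiers: list[Tier], seconds: float) -> dict[str, int]:
    """Count how often each tier runs over a simulated time span."""
    total_ticks = round(seconds * BASE_HZ)
    for tick in range(total_ticks):
        for tier in tiers:
            if tick % tier.period_ticks == 0:
                tier.runs += 1
    return {tier.name: tier.runs for tier in tiers}


tiers = [
    Tier("reflex", 1000),     # torque control, balance
    Tier("skill", 50),        # assumed value inside the 10-100 Hz band
    Tier("deliberation", 1),  # upper end of the 0.1-1 Hz band
]
counts = simulate(tiers, seconds=2.0)
print(counts)  # {'reflex': 2000, 'skill': 100, 'deliberation': 2}
```

In this toy run the reflex loop executes a thousand times for every deliberation step, which is why a stall at the top must never be able to block the bottom.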
Where the LLM plugs in (and where it must not)
An LLM is a superb deliberator: it knows that screws go in boxes and that you don't put the box on the wet paint. It is a catastrophic reflex controller: 200 ms of token latency is 200 missed cycles at 1 kHz, more than enough time for the robot to fall. The architecture rule is strict: language plans, policies act, control survives. The LLM emits skill calls with parameters; it never touches a motor.
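One way to enforce "skill calls with parameters, never motors" is a narrow validation gate between the model's text and the skill tier. The sketch below assumes the LLM is prompted to emit JSON of the form `{"skill": ..., "params": ...}`; the `SKILLS` registry, its parameter names, and its limits are hypothetical values chosen for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical skill registry: skill name -> {parameter: (min, max)}.
SKILLS = {
    "grasp": {"force_n": (0.0, 40.0)},
    "place": {"height_m": (0.0, 1.5)},
}


@dataclass(frozen=True)
class SkillCall:
    name: str
    params: dict


def parse_skill_call(llm_output: str) -> SkillCall:
    """Validate one LLM-emitted call; raise ValueError on anything unexpected."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("skill")
    if not isinstance(name, str) or name not in SKILLS:
        raise ValueError(f"unknown skill: {name!r}")
    params = data.get("params")
    spec = SKILLS[name]
    if not isinstance(params, dict) or set(params) != set(spec):
        raise ValueError(f"{name} expects exactly {sorted(spec)}")
    for key, (low, high) in spec.items():
        value = params[key]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not low <= value <= high:
            raise ValueError(f"{key}={value!r} outside [{low}, {high}]")
    return SkillCall(name, params)


ok = parse_skill_call('{"skill": "grasp", "params": {"force_n": 12.5}}')
print(ok)

try:
    # A direct motor command is not in the registry, so it is rejected.
    parse_skill_call('{"skill": "set_motor_torque", "params": {"nm": 3}}')
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Because the registry is an allowlist, anything the model invents that is not a known skill with in-range parameters is dropped before it reaches a policy.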
VLAs compress the middle
Vision-language-action models (our VLA guide) merge perception and skill selection into one network: pixels + instruction → actions. The hierarchy doesn't disappear — the reflex tier still guards the hardware — but the skill tier becomes general instead of a library of hand-trained behaviors.
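The interface can be sketched in a few lines: a policy maps pixels plus an instruction to joint targets, and a reflex-side guard limits what actually reaches the hardware. Everything here is a stand-in under stated assumptions: `stub_vla` replaces a learned network with fixed outputs, and `MAX_STEP_RAD` is an assumed per-step limit, not a value from any real robot.

```python
from typing import Callable, Sequence

Image = Sequence[Sequence[float]]             # stand-in for camera pixels
Policy = Callable[[Image, str], list[float]]  # pixels + instruction -> joint targets

MAX_STEP_RAD = 0.05  # assumed per-step joint limit enforced by the reflex tier


def stub_vla(image: Image, instruction: str) -> list[float]:
    # Placeholder for a learned network; returns deliberately large targets.
    return [1.0, -1.0, 0.02]


def reflex_guard(current: list[float], target: list[float]) -> list[float]:
    """Clamp each joint's commanded change to +/- MAX_STEP_RAD."""
    guarded = []
    for q, q_target in zip(current, target):
        delta = max(-MAX_STEP_RAD, min(MAX_STEP_RAD, q_target - q))
        guarded.append(q + delta)
    return guarded


policy: Policy = stub_vla
current = [0.0, 0.0, 0.0]
target = policy([[0.0]], "pick up the screw")
command = reflex_guard(current, target)
print(command)  # [0.05, -0.05, 0.02]
```

The guard does not care whether the targets came from a hand-trained skill or a general VLA, which is the sense in which the hierarchy survives the merge.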
Memory and world state
Between the tiers sits a shared world state: object poses, task progress, and a semantic map. The deliberator reads and writes it. The new ingredient is the world model, a learned simulator the robot can query: "if I pull this tote, what happens?" Prediction before action is the difference between competence and confidence.
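A minimal sketch of "predict before acting" follows. The world model here is a hand-written toy rule, not a learned simulator, and the 0.3 m pull distance and the table-edge check are assumptions made up for the example; only the shape of the query (state plus proposed skill in, predicted state out) is the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldState:
    object_poses: dict         # object name -> (x, y, z) in metres
    task_progress: tuple = ()  # skills completed so far


def toy_world_model(state: WorldState, skill: str, obj: str) -> WorldState:
    """Stand-in for a learned simulator: predict the state after one skill."""
    poses = dict(state.object_poses)
    if skill == "pull":
        x, y, z = poses[obj]
        poses[obj] = (x - 0.3, y, z)  # assumed: a pull moves the object 0.3 m
    return WorldState(poses, state.task_progress + (f"{skill}:{obj}",))


def is_safe(state: WorldState) -> bool:
    # Hypothetical check: the table edge is at x = 0, nothing may cross it.
    return all(x >= 0.0 for x, _, _ in state.object_poses.values())


def should_execute(state: WorldState, skill: str, obj: str) -> bool:
    """Query the model first; only approve skills with a safe predicted outcome."""
    return is_safe(toy_world_model(state, skill, obj))


near_edge = WorldState({"tote": (0.2, 0.0, 0.8)})
mid_table = WorldState({"tote": (0.5, 0.0, 0.8)})
print(should_execute(near_edge, "pull", "tote"))  # False: tote would leave the table
print(should_execute(mid_table, "pull", "tote"))  # True
```

The query never mutates the real world state, so the deliberator can try several candidate skills against the model before committing to one.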
The enterprise echo
Strip the motors and this is the same architecture as a good business agent: deterministic guardrails at the bottom (your reflex tier: validation, permissions), specialized tools in the middle, and a language model deliberating on top, never executing directly. We build that stack on-prem for documents and processes; robotics just makes the layering visible because a robot that fails falls over.
Localized AI fine-tunes small open models on your data and deploys them on your hardware — GDPR by architecture, zero per-token costs. Average setup: 72 hours.