Capability · the engineering layer

The depth underneath every system we ship.

Most studios ship a thin app over a SaaS rental. We engineer a research-grade substrate underneath: formal verification, signed receipts, multi-agent supervision, recursive self-improvement. This page is the technical due-diligence room. If you’re evaluating XXI for a serious build, read on.

I

// When the system has to be more than software

Beyond chatbots. Systems that think, verify, and improve.

Some projects need more than a clean UI and a database. When the work has to hold under audit, refuse to hallucinate, and get sharper week over week, the system needs an engineering layer underneath that most studios don’t reach for. Math, verification, memory, autonomy, learning — engineered into the project from day one.

— What sits at this tier —

i

Autonomous decision engines

Quantitative agents that observe, decide, and act on real business workflows without supervision. Platt-calibrated scoring, Kalman-tracked latent state, Thompson-sampled experimentation, KKT-optimised allocation.

For: revenue operations, capital allocation, dynamic pricing, fraud triage, portfolio decisions. The math is the moat.
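
To make one of those terms concrete, here is a minimal sketch of Thompson-sampled experimentation over Bernoulli outcomes. It is illustrative only, not production code: `thompson_pick`, `run_experiment`, and the conversion rates are hypothetical names and numbers chosen for the example, and a uniform Beta(1, 1) prior is assumed.

```python
import random


def thompson_pick(successes, failures, rng):
    """Draw one plausible success rate per arm from its Beta posterior
    (uniform Beta(1, 1) prior assumed) and pick the arm with the highest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_experiment(true_rates, rounds, seed=0):
    """Simulate `rounds` decisions against hypothetical true conversion rates.
    Returns how many times each arm was chosen."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = thompson_pick(successes, failures, rng)
        pulls[arm] += 1
        # Simulated outcome; in a real workflow this is the observed result.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Three hypothetical pricing variants with made-up conversion rates.
pulls = run_experiment([0.05, 0.10, 0.60], rounds=2000)
print(pulls)
```

The design point: exploration is not scheduled by hand. Arms with little data have wide posteriors and still get sampled occasionally, while traffic concentrates on the arm the evidence favours.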

ii

Verified reasoning systems

Neuro-symbolic agents. The language model proposes; a formal solver verifies. Z3 / SMT / constraint logic wraps every consequential decision before it fires.

For: compliance evidence, contract review, authorisation logic, regulated-industry workflows. On the paths that matter, no model output takes effect unless the solver accepts it.

iii

Cognitive architectures

Multi-agent supervisors with three-tier memory (working / episodic / semantic), hierarchical planning, self-critique loops, durable execution. Whole teams of specialised agents, coordinated.

For: operations that take a junior team to run today. Replace the team. Keep the institutional memory.

iv

Recursive self-improving systems

Agents that get better while you sleep. Eval suites as training signal, DSPy-compiled prompts, bandits over strategies, anytime-valid A/B testing. Every outcome feeds the next decision.

For: any system that runs for weeks. The system you receive on day one is a baseline. By day ninety, it has compounded.
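
"Anytime-valid" means the test can be checked after every single outcome without inflating the false-positive rate. One standard construction is a likelihood-ratio e-process, sketched below under stated assumptions: outcomes are Bernoulli, the null hypothesis is a fixed baseline rate `p0`, and the alternative `p1` is chosen in advance. The function name and the rates are hypothetical.

```python
def sequential_test(outcomes, p0, p1, alpha=0.05):
    """Likelihood-ratio e-process for Bernoulli outcomes.

    Under H0 (true rate == p0) the running product is a nonnegative
    martingale starting at 1, so by Ville's inequality the chance it ever
    reaches 1/alpha is at most alpha. That is what permits peeking after
    every observation.

    Returns (n, wealth): n is the observation at which H0 was rejected,
    or None if it never was.
    """
    wealth = 1.0
    threshold = 1.0 / alpha
    for n, success in enumerate(outcomes, start=1):
        # Multiply by how much more likely this outcome is under p1 than p0.
        wealth *= (p1 / p0) if success else ((1 - p1) / (1 - p0))
        if wealth >= threshold:
            return n, wealth
    return None, wealth


# A strategy that succeeds every time clears the bar quickly;
# one that merely alternates never does.
print(sequential_test([1] * 20, p0=0.5, p1=0.7))
print(sequential_test([1, 0] * 50, p0=0.5, p1=0.7))
```

A fixed `p1` keeps the sketch short. Mixing over several alternatives is a common refinement when the size of the improvement is unknown.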

v

Mission-critical AI auditing

Public eval dashboards, trace viewers, formal verification layers. Every decision the system makes is replayable, diffable, attributable. Proof the system holds, not promises.

For: SOC 2 prep, regulated industries, board-level AI risk reviews. Bring us the existing system; leave with the audit-ready version.

II

// Substrate · Polyglot

The layer underneath every agentic system we ship.

Polyglot compiles any reachable API — OpenAPI, GraphQL, or undocumented — into a sandboxed, audit-signed TypeScript or Python tool. Every tool ships with a 1 KB Ed25519 receipt your runtime, your auditor, and any third party can independently verify. No SDK. No daemon. No trust required.

62 enforcement rules (29 TypeScript · 33 Python)
45 named attack vectors blocked (regression-gated in CI · published matrix)
1 KB receipt size (payloads hashed by reference, not inlined)
< 200 µs verifyReceipt latency (WebCrypto Ed25519 · client-side)
3 cryptographic guarantees (identity · type-safety · payload)
RFC 6962 audit log primitive (Merkle inclusion + consistency proofs)
RFC 3161 timestamp anchor (Ed25519 · no X.509 ceremony)
0 LLM in the verification path (after compile · deterministic forever)
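
For readers who want to see what an RFC 6962 inclusion proof involves, the sketch below builds a Merkle tree hash and checks an audit path using the RFC's domain-separated hashing (0x00 prefix for leaves, 0x01 for interior nodes). It is a minimal illustration, not Polyglot's log implementation: the function names and the sample log entries are hypothetical, and consistency proofs are not shown.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries) -> bytes:
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = split_point(len(entries))
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def inclusion_path(index: int, entries) -> list:
    """Sibling hashes from the leaf up to the root."""
    if len(entries) == 1:
        return []
    k = split_point(len(entries))
    if index < k:
        return inclusion_path(index, entries[:k]) + [merkle_root(entries[k:])]
    return inclusion_path(index - k, entries[k:]) + [merkle_root(entries[:k])]


def root_from_path(leaf: bytes, index: int, size: int, path: list) -> bytes:
    """Recompute the root from a leaf hash and its audit path."""
    if size == 1:
        if path:
            raise ValueError("audit path too long")
        return leaf
    if not path:
        raise ValueError("audit path too short")
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(leaf, index, k, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(leaf, index - k, size - k, path[:-1]))


def verify_inclusion(entry: bytes, index: int, size: int, path: list, root: bytes) -> bool:
    if not 0 <= index < size:
        return False
    try:
        return root_from_path(leaf_hash(entry), index, size, path) == root
    except ValueError:
        return False


# Hypothetical log of seven receipts; prove that entry 4 is included.
log = [f"receipt-{i}".encode() for i in range(7)]
root = merkle_root(log)
proof = inclusion_path(4, log)
print(verify_inclusion(log[4], 4, len(log), proof, root))
```

The verifier needs only the entry, its index, the tree size, the path, and the signed root. It never needs the rest of the log.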

— Three cryptographic guarantees —

i

Cryptographic identity isolation

A 32-byte master seed never leaves process RAM. HKDF-SHA256 derives per-blueprint and per-invocation scoped Ed25519 keypairs. Compromise of any scoped key leaks only that scope — HKDF is one-way; nothing reverses to the master.
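
The derivation chain can be sketched with nothing but the standard library. The block below implements HKDF-SHA256 as specified in RFC 5869 and derives a per-blueprint seed, then a per-invocation seed from it. The `blueprint:` and `invocation:` labels, the identifiers, and the function names are assumptions made for illustration, not Polyglot's actual derivation scheme.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: an absent salt defaults to a string of HashLen zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_scoped_seed(master: bytes, blueprint_id: str, invocation_id: str = "") -> bytes:
    """Derive a 32-byte seed scoped to a blueprint, optionally narrowed to one
    invocation. The info labels are hypothetical."""
    seed = hkdf_sha256(master, b"blueprint:" + blueprint_id.encode())
    if invocation_id:
        seed = hkdf_sha256(seed, b"invocation:" + invocation_id.encode())
    return seed


master_seed = os.urandom(32)  # held in process memory only in this sketch
blueprint_seed = derive_scoped_seed(master_seed, "billing-api")
invocation_seed = derive_scoped_seed(master_seed, "billing-api", "run-0001")
print(len(blueprint_seed), blueprint_seed != invocation_seed)
```

Each 32-byte output is the right size to serve as an Ed25519 private seed. Turning it into a keypair needs a signing library, for example `Ed25519PrivateKey.from_private_bytes` in the `cryptography` package, which is left out here to keep the sketch dependency-free.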

ii

Type-safe AST compilation

Every synthesized module passes 29 versioned ts-morph rules (TypeScript) or 33 ast-module rules (Python). Any critical or high-severity finding refuses the compile at the firewall, before sandbox execution. The policy version and content hash are embedded in every receipt so an auditor five years from now can replay the exact ruleset that accepted it.
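
As an illustration of what one such rule looks like on the Python side, the sketch below walks a module with the standard `ast` library, flags calls to a small deny-list, and refuses the compile on any critical or high finding. The rule set, severities, policy version string, and function names are all hypothetical. The real ruleset is larger and is not reproduced here.

```python
import ast
import hashlib

# Hypothetical policy: call name -> severity.
POLICY_VERSION = "example-0.0.1"
BANNED_CALLS = {
    "eval": "critical",
    "exec": "critical",
    "compile": "high",
    "__import__": "high",
}
BLOCKING = {"critical", "high"}


def policy_hash() -> str:
    """Content hash of the ruleset, so a verdict can name the rules that produced it."""
    canonical = repr(sorted(BANNED_CALLS.items())).encode()
    return hashlib.sha256(canonical).hexdigest()


def scan(source: str):
    """Return (call_name, severity, line) for every banned direct call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            severity = BANNED_CALLS.get(node.func.id)
            if severity:
                findings.append((node.func.id, severity, node.lineno))
    return sorted(findings, key=lambda f: f[2])


def compile_gate(source: str) -> dict:
    """Accept or refuse a candidate module before it is ever executed."""
    findings = scan(source)
    return {
        "accepted": not any(sev in BLOCKING for _, sev, _ in findings),
        "findings": findings,
        "policy_version": POLICY_VERSION,
        "policy_hash": policy_hash(),
    }


print(compile_gate("total = 1 + 1")["accepted"])
print(compile_gate("value = eval(user_input)")["findings"])
```

The source is parsed but never run, so the gate is safe to apply to untrusted code. This single rule only catches direct calls by name. Aliased or attribute-based access would need further rules.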

iii

Mathematical payload verification

18 deterministic injection detectors plus 5 schema-conformance walkers run on every candidate payload at the firewall, before the synthesized code touches the network. The detectors cover cloud-metadata SSRF hosts, JWT-shaped secrets, Luhn-validated credit cards, prototype-pollution keys, and 14 other patterns. Evidence is masked whenever a detector fires, and every detector is versioned permanently.
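
Three of the named detector types are simple enough to sketch in a few lines. The block below walks a JSON-like payload and reports Luhn-valid card numbers, prototype-pollution keys, and cloud-metadata hosts, masking the evidence it returns. The detector names, the mask format, and the host and key lists are illustrative assumptions, not the shipped detectors.

```python
import re

CARD_RE = re.compile(r"\b\d{13,19}\b")
POLLUTION_KEYS = {"__proto__", "constructor", "prototype"}
METADATA_HOSTS = ("169.254.169.254", "metadata.google.internal")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask(value: str) -> str:
    """Keep two characters at each end so a finding never echoes the secret."""
    if len(value) <= 4:
        return "*" * len(value)
    return value[:2] + "*" * (len(value) - 4) + value[-2:]


def scan_payload(payload, path="$"):
    """Return (detector, json_path, masked_evidence) tuples for a JSON-like value."""
    findings = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if key in POLLUTION_KEYS:
                findings.append(("prototype-pollution-key", f"{path}.{key}", key))
            findings += scan_payload(value, f"{path}.{key}")
    elif isinstance(payload, list):
        for i, item in enumerate(payload):
            findings += scan_payload(item, f"{path}[{i}]")
    elif isinstance(payload, str):
        for match in CARD_RE.finditer(payload):
            if luhn_valid(match.group()):
                findings.append(("luhn-card", path, mask(match.group())))
        for host in METADATA_HOSTS:
            if host in payload:
                findings.append(("cloud-metadata-ssrf", path, mask(host)))
    return findings


candidate = {
    "note": "card 4111111111111111 on file",
    "callback": "http://169.254.169.254/latest/meta-data/",
    "__proto__": {"admin": True},
}
for finding in scan_payload(candidate):
    print(finding)
```

Every check is a pure function of the payload, so the same input yields the same verdict on every run. That is the property that lets a detector be versioned and replayed later.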

Mapped to GDPR · HIPAA · SOX · PCI-DSS · EU AI Act Article 12 · NIST AI RMF · SOC 2 · ISO 27001 A.8.28 in the formal STRIDE threat model document. Every receipt is the artifact your auditor asks for.

policy v1.1.0 · substrate state at this build · commit-pinned

Read the substrate doc
Verify a receipt
No login. No backend. Client-side WebCrypto.

III

// Currently engineering for ourselves

Tools we build for ourselves before they reach your project.

A small set of internal primitives we engineer against our own work first. They earn their way into a client project only after they’ve held under real load on ours. Quiet, deliberate, math-first.

Substrate

Persistent memory kernels

Content-addressed storage, Matryoshka-tiered vectors, behavioural transition graphs. The memory underneath every agent we ship. Open-sourced as Spine.
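
Content addressing is the simplest of these ideas to show in code: the key for a blob is the hash of its bytes, so identical content is stored once and any corruption is detectable on read. The class below is a minimal in-memory sketch with hypothetical names. It does not represent Spine's API or storage format.

```python
import hashlib


class ContentStore:
    """In-memory content-addressed store: address = SHA-256 of the bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so writes deduplicate.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read: a mismatch means the stored bytes were altered.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError(f"integrity check failed for {address}")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put(b"agent observation: invoice 1042 approved")
second = store.put(b"agent observation: invoice 1042 approved")
print(first == second, len(store))
```

Because addresses are derived from content, a record that references an address pins the exact bytes it saw. A later audit can replay a decision against the same memory.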

Verification

Formal safety envelopes

Z3 / SMT checks wrap every consequential decision, so the operations that must hold under audit run only on solver-verified paths.

Orchestration

Multi-agent meshes

Specialised agents that compose into systems with role differentiation, observable traces, and economic supervision of compute allocation.

Learning

Recursive self-improvement

Evals as training signal, DSPy-compiled prompts, anytime-valid testing. Every outcome becomes feedback for the next decision.

Public notes on these primitives are pinned for the engineers who want depth. We do not sell the substrate. We use it.

IV

// If you’re evaluating XXI for a serious build

Bring the operation. We’ll show the receipts.

If your project needs any of the depth on this page — formal verification, signed audit trails, multi-agent supervision, recursive self-improvement — book a discovery call. We’ll walk you through working systems on this substrate, not slide decks.

Book a 30-minute call