_INVERSED · SALON

An Architecture of Trustable Agents

Good social behaviors. Hard limits. Creative constraint.

20 minutes · then we split into three rooms.

_Who We Are

Inversed

We come from applied cryptography and the security of high-stakes systems — MPC, ZK, biometric identity at global scale (Worldcoin, Aria). We bring that engineering bar to a new problem: making autonomous AI agents safe to deploy.

Aurel — CTO. Background in cryptography-heavy systems. The team scales frontier crypto into production-grade infrastructure for the agentic age.


_The Product

Threshold

A control plane for AI agents.

It sits between agents and everything they touch. Credentials never reach the agent. Policies live in the runtime, not in prompts. Every action is annotated and signed at the boundary — not self-reported.

[Diagram: Agents → Threshold → Upstream]
  • Agents: AI agents (Claude · GPT · custom)
  • Threshold: T8 Engine (data plane: annotation · policy · credential broker); Control Plane (backend: policies · identity · audit); Admin Dashboard (frontend)
  • Upstream: upstream APIs (Google · Notion · Anthropic); enterprise systems
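
To make the boundary idea concrete, here is a minimal sketch of a broker that holds the credential and signs a record of each action itself. It is an illustration only, not Threshold's actual API: the names `forward`, `sign`, `verify`, `BOUNDARY_KEY`, and `CREDENTIALS` are all hypothetical, and HMAC stands in for whatever signature scheme the real system uses.

```python
import hashlib
import hmac
import json

# Hypothetical values for illustration; neither is ever handed to the agent.
BOUNDARY_KEY = b"demo-boundary-signing-key"   # signing key held by the runtime
CREDENTIALS = {"notion": "secret-token-123"}  # upstream secrets held by the broker


def sign(record: dict) -> str:
    """Sign a canonical JSON encoding of the record with the boundary key."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hmac.new(BOUNDARY_KEY, payload, hashlib.sha256).hexdigest()


def forward(agent_id: str, service: str, action: str) -> dict:
    """Broker one action: look up the credential and return a signed annotation."""
    if service not in CREDENTIALS:
        raise PermissionError(f"no credential brokered for {service}")
    _token = CREDENTIALS[service]  # would be attached to the upstream request here
    record = {"agent": agent_id, "service": service, "action": action}
    # The annotation is produced by the boundary, not reported by the agent.
    return {"record": record, "signature": sign(record)}


def verify(annotation: dict) -> bool:
    """Check that an annotation was signed at the boundary and not altered."""
    expected = sign(annotation["record"])
    return hmac.compare_digest(expected, annotation["signature"])
```

The agent only ever receives the annotation, which contains no secret, and any later edit to the record invalidates the signature.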

_Three Pillars

Alignment · Security · Privacy

Verified at runtime, not promised in policy.

Alignment
What the agent claims it did, checked against what the runtime independently observes. Drift caught before consequences land.
Security
Least-privilege per agent, per task. Identity, delegation, and provenance are tamper-evident end to end.
Privacy
Agents process data they cannot read. Cognition and access are structurally decoupled at the runtime layer.

_The Question

What does it take to make an agent trustable?

The interesting part isn't bigger models — it's the scaffolding around them. The boundary where intent meets action. The runtime where data meets logic. The interface where humans stay in control without becoming the bottleneck. The economy that emerges when software stops looking like SaaS.

"Post-mortem: how an agent deleted our prod DB." Every credential checked out. Every tool call was authorised. The agent followed instructions — just not the ones we thought we'd given it. That's the failure mode the scaffolding has to prevent — structurally, not by asking nicely.

We've grouped the open problems into three conversations.


_Three Conversations

Pick a room.

_01
Architecture of Trust
How do we engineer agents that can't go off-script — even under prompt injection, even when the model is wrong?
Backend · CAPE · DLP
_02
Human Control
Who actually drives these systems? How do CTOs, security, and compliance stay in the loop without drowning in logs?
UX · HITL · Review
_03
The New Economy of Code
If software writes software, what survives? Custom apps, one-person companies, and the safety of dark factories.
SaaS · Dark factories

_Room 01

Architecture of Trust

An agent is a loop of cognition + action. Wherever the loop touches the world, that boundary is where trust can be engineered — or lost.

_Backend of agent control
  • Annotations — structured interpretation of activity
  • Powers — programmable permissions, per agent, per task
  • Functions — script-based actions, serverless backend
  • Runtime + credential isolation — the agent never holds the keys
_CAPE — cognition ≠ execution
  • Agent sees only synthetic data; writes a script
  • User signs an execution token; Threshold runs it sandboxed
  • Result streams directly to the user or destination; the agent gets an OK and nothing else
Private data never enters the context window — GDPR Art. 25 by design.
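
The three CAPE steps above can be sketched as a toy flow. Everything here is an assumption made for illustration: the function names, the HMAC-based execution token, and the sample rows are hypothetical, and `exec` with a restricted namespace is only a stand-in for a real sandbox, not a secure one.

```python
import hashlib
import hmac

USER_KEY = b"demo-user-key"  # hypothetical stand-in for the user's signing key

# Real data lives only in the runtime; the agent is shown the synthetic rows.
REAL_ROWS = [{"name": "Alice", "salary": 5200}, {"name": "Bob", "salary": 4800}]
SYNTHETIC_ROWS = [{"name": "person_a", "salary": 1000},
                  {"name": "person_b", "salary": 2000}]


def agent_write_script(sample_rows: list) -> str:
    """Cognition step: the agent sees synthetic rows and returns script source."""
    return "result = sum(r['salary'] for r in rows) / len(rows)"


def user_sign(script: str) -> str:
    """The user authorises exactly this script by signing its text."""
    return hmac.new(USER_KEY, script.encode(), hashlib.sha256).hexdigest()


def runtime_execute(script: str, token: str, destination: list) -> str:
    """Execution step: run the signed script on real data, return status only."""
    if not hmac.compare_digest(user_sign(script), token):
        return "REJECTED"
    # Toy isolation only: a single namespace exposing two builtins and the rows.
    namespace = {"__builtins__": {"sum": sum, "len": len}, "rows": REAL_ROWS}
    exec(script, namespace)
    destination.append(namespace["result"])  # result goes to the user, not the agent
    return "OK"


inbox = []  # the user's destination
script = agent_write_script(SYNTHETIC_ROWS)
status = runtime_execute(script, user_sign(script), inbox)
```

In this run the agent learns only `status`, while the computed average lands in `inbox`. A script changed after signing no longer matches its token and is rejected.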
_Audit & defence
  • Tamper-evident, externally verifiable audit log
  • State-of-the-art defences against prompt injection, plus data-loss prevention
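
One common way to make a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below assumes that technique for illustration; the slides do not say how Threshold's audit log is built, and `append`, `verify_chain`, and `GENESIS` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the preceding entry."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append(log: list, event: dict) -> None:
    """Append an event, linking it to the current head of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

An outside party who has been given the latest hash can rerun `verify_chain` on a copy of the log and compare heads, which is one way such a log becomes externally verifiable.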
Starter question

If the LLM is allowed to be wrong half the time, where in the stack do we put the parts that can't be wrong?

Open thread: how do you protect honest-but-lazy users from themselves — e.g. copy-paste through the trust boundary?

Deep dive — Maller, CAPE: Context-Aware Private Execution · /CAPE.pdf


_Room 02

Human Control

A control plane is only as good as the humans who configure it and the humans who review it. Real deployments have CTOs, security officers, compliance, and the operator — each with a different mental model.

_Stakeholders & configuration
  • CTO, security, compliance — different POVs, same backend
  • Permissions that actually match intent
Programmable backend, multiple frontends pitched at different levels of expertise. AI assistance for config drafting and review.
_Reviewing logs
  • Nobody reads them at scale today
Continuous first-pass review by AI; humans handle the exceptions.
_Human in the loop
  • Async approval flows — how do agent frameworks integrate them?
  • Intent-scoped: by task or category, not by tool call
  • Time-limited capability tokens: agent plans an envelope (10 reads here, 1 write there), human grants it
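
The envelope idea in the last bullet can be sketched as a small data structure: the agent proposes a budget of operations, a human approves it once, and the runtime decrements it per call until it runs out or expires. The `Envelope` class and its fields are hypothetical, chosen only to illustrate the shape of such a token.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Envelope:
    """A time-limited budget of operations, requested by an agent."""
    budget: dict          # e.g. {("crm", "read"): 10, ("crm", "write"): 1}
    expires_at: float     # Unix timestamp after which the envelope is void
    approved: bool = False

    def approve(self) -> None:
        """Human grants the whole plan in one decision."""
        self.approved = True

    def consume(self, resource: str, op: str, now: Optional[float] = None) -> bool:
        """Spend one unit of budget; False if unapproved, expired, or exhausted."""
        now = time.time() if now is None else now
        if not self.approved or now >= self.expires_at:
            return False
        key = (resource, op)
        if self.budget.get(key, 0) <= 0:
            return False
        self.budget[key] -= 1
        return True
```

For example, `Envelope(budget={("crm", "read"): 10, ("crm", "write"): 1}, expires_at=time.time() + 600)` would cover "10 reads here, 1 write there" for ten minutes once `approve()` has been called, so the human approves one plan rather than eleven tool calls.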
Starter question

What's the right unit of human approval — a tool call, a task, or a plan?

Open thread: nobody actually reads the logs. What would make them want to?


_Room 03

The New Economy of Code

If software generates software, generic SaaS loses its moat. What replaces it — and what does it take to trust a codebase nobody fully read?

_"SaaS is dead"
  • Generic apps losing value
  • Custom apps per company, per small team — disposable software
  • One-person companies; consulting as selling expertise and judgment
_Dark factories
  • Software security when no human wrote the diff
  • Realistic test harnesses, including hold-out scenarios
  • Review, accountability, understanding, maintenance over time
Modular zero-trust architectures, granular blast radius. High-quality, up-to-date, multi-level documentation and changelog. Parallels with corporations and hierarchy.
Starter question

What's the sufficient set of criteria for a dev and an owner to feel safe with code they didn't write?

Open thread: when SaaS dies, what's the smallest unit of software a company still buys?


_Format

How the rooms work.

Pick one
Self-sort by the question that nags at you most. Roughly 8–10 people per room.
Open with the question
Each room starts with the starter question on the slide. Disagree out loud. Better questions welcome.
Bring one thing back
At the end, one person from each room shares the sharpest disagreement or the best new question.

30 minutes per round. We reconvene to compare notes.


_Over to you

Three rooms. One question each.

_01 · ARCHITECTURE OF TRUST
Where do we put the parts that can't be wrong?
_02 · HUMAN CONTROL
What's the right unit of human approval — call, task, or plan?
_03 · NEW ECONOMY OF CODE
What makes a dev and an owner feel safe with code they didn't write?
_INVERSED
aurel@inversed.ai · inversed.ai