_INVERSED · SALON

An Architecture of Trustable Agents

Good social behaviors. Hard limits. Creative constraint.

20 minutes · then we split into three rooms.

_Who We Are

Inversed

We come from applied cryptography and the security of high-stakes systems — MPC, ZK, biometric identity at global scale (Worldcoin, Aria). We bring that engineering bar to a new problem: making autonomous AI agents safe to deploy.

Aurel — CTO. Background in cryptography-heavy systems. The team scales frontier crypto into production-grade infrastructure for the agentic age.

_The Product

Threshold

A control plane for AI agents.

It sits between agents and everything they touch. Credentials never reach the agent. Policies live in the runtime, not in prompts. Every action is annotated and signed at the boundary — not self-reported.

_Three Pillars

Alignment · Security · Privacy

Verified at runtime, not promised in policy.

Alignment

What the agent claims it did, checked against what the runtime independently observes. Drift caught before consequences land.

Security

Least-privilege per agent, per task. Identity, delegation, and provenance are tamper-evident end to end.

Privacy

Agents process data they cannot read. Cognition and access are structurally decoupled at the runtime layer.

_The Question

What does it take to make an agent trustable?

The interesting part isn't bigger models — it's the scaffolding around them. The boundary where intent meets action. The runtime where data meets logic. The interface where humans stay in control without becoming the bottleneck. The economy that emerges when software stops looking like SaaS.

"Post-mortem: how an agent deleted our prod DB." Every credential checked out. Every tool call was authorised. The agent followed instructions — just not the ones we thought we'd given it. That's the failure mode the scaffolding has to prevent — structurally, not by asking nicely.

We've grouped the open problems into three conversations.

_Three Conversations

Pick a room.

_01

Architecture of Trust

How do we engineer agents that can't go off-script — even under prompt injection, even when the model is wrong?

Backend · CAPE · DLP

_02

Human Control

Who actually drives these systems? How do CTOs, security, and compliance stay in the loop without drowning in logs?

UX · HITL · Review

_03

The New Economy of Code

If software writes software, what survives? Custom apps, one-person companies, and the safety of dark factories.

SaaS · Dark factories

_Room 01

Architecture of Trust

An agent is a loop of cognition + action. Wherever the loop touches the world, that boundary is where trust can be engineered — or lost.

_Backend of agent control

Annotations — structured interpretation of activity
Powers — programmable permissions, per agent, per task
Functions — script-based actions, serverless backend
Runtime + credential isolation — the agent never holds the keys
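
A minimal sketch of what "programmable permissions, per agent, per task" could look like, assuming a deny-by-default check that runs in the runtime rather than in the prompt. The names `Power` and `is_allowed` and all identifiers in the example are hypothetical and are not Threshold's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Power:
    """One grant: a single agent, on a single task, may do these actions on a tool."""
    agent_id: str
    task_id: str
    tool: str
    actions: frozenset


def is_allowed(powers, agent_id, task_id, tool, action):
    """Deny by default; allow only when some grant matches every field."""
    return any(
        p.agent_id == agent_id
        and p.task_id == task_id
        and p.tool == tool
        and action in p.actions
        for p in powers
    )


# Hypothetical grant: agent-7 may only read billing_db during the invoice run.
powers = [Power("agent-7", "invoice-run", "billing_db", frozenset({"read"}))]

print(is_allowed(powers, "agent-7", "invoice-run", "billing_db", "read"))    # True
print(is_allowed(powers, "agent-7", "invoice-run", "billing_db", "delete"))  # False
```

Because the grant is keyed on the task as well as the agent, the same agent gets nothing by default when it moves to a different task.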

_CAPE — cognition ≠ execution

Agent sees only synthetic data; writes a script
User signs an execution token; Threshold runs it sandboxed
Result streams direct to the user or destination — agent gets OK, nothing else

Private data never enters the context window — GDPR Art. 25 by design.
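
The three steps above can be sketched as a toy flow. This is an illustration under loud assumptions, not the CAPE protocol itself: an HMAC stands in for the user's signature, `exec` with a stripped-down namespace stands in for a real sandbox (it is not one), and every name here (`USER_KEY`, `agent_write_script`, `sign_token`, `run_bound`) is hypothetical.

```python
import hashlib
import hmac

USER_KEY = b"demo-user-key"  # hypothetical stand-in for the user's signing key


def agent_write_script(synthetic_rows):
    """The agent inspects synthetic rows only, to learn the schema."""
    column = sorted(synthetic_rows[0])[0]
    return f"result = sum(r[{column!r}] for r in rows) / len(rows)"


def sign_token(script, key):
    """The user approves one exact script; any edit invalidates the token."""
    return hmac.new(key, script.encode(), hashlib.sha256).hexdigest()


def run_bound(script, token, key, real_rows, deliver):
    """Runtime side: verify the token, run on real data, return status only."""
    if not hmac.compare_digest(token, sign_token(script, key)):
        return "REJECTED"
    # Toy isolation only: a real deployment needs an actual sandbox.
    scope = {"rows": real_rows, "__builtins__": {"sum": sum, "len": len}}
    exec(script, scope)
    deliver(scope["result"])  # the result goes to the user, not to the agent
    return "OK"


delivered = []
script = agent_write_script([{"salary": 1}])  # synthetic data
token = sign_token(script, USER_KEY)
real = [{"salary": 1000}, {"salary": 3000}]   # never shown to the agent
status = run_bound(script, token, USER_KEY, real, delivered.append)
print(status, delivered)  # OK [2000.0]
```

The only value that flows back toward the agent is the status string, which is the property the slide describes as "agent gets OK, nothing else".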

_Audit & defence

Tamper-evident, externally verifiable audit log
State-of-the-art prompt-injection defence and data-loss prevention
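
One common way to make a log tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below assumes that technique for illustration; the source does not say how Threshold builds its log, and the function names and record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the chain


def entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append(log, record):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log):
    """Recompute every link; an edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"agent": "agent-7", "action": "read", "target": "billing_db"})
append(log, {"agent": "agent-7", "action": "write", "target": "report"})
print(verify(log))  # True

log[0]["record"]["action"] = "delete"  # simulate tampering with an old entry
print(verify(log))  # False
```

A chain on its own only detects edits made by someone who cannot rewrite the whole log. For external verifiability, the latest hash would also need to be published or countersigned somewhere the log operator does not control.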

Starter question

If the LLM is allowed to be wrong half the time, where in the stack do we put the parts that can't be wrong?

Open thread: how do you protect honest-but-lazy users from themselves — e.g. copy-paste through the trust boundary?

Deep dive — Maller, CAPE: Context-Aware Private Execution · /CAPE.pdf

_Room 02

Human Control

A control plane is only as good as the humans who configure it and the humans who review it. Real deployments have CTOs, security officers, compliance, and the operator — each with a different mental model.

_Stakeholders & configuration

CTO, security, compliance — different POVs, same backend
Permissions that actually match intent

One programmable backend, with multiple frontends pitched at different levels of technical comfort. AI assistance for drafting and reviewing configuration.

_Reviewing logs

Nobody reads them at scale today

First-pass continuous review by AI; humans on exception.

_Human in the loop

Async approval flows — how do agent frameworks integrate them?
Intent-scoped: by task or category, not by tool call
Time-limited capability tokens: agent plans an envelope (10 reads here, 1 write there), human grants it
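
A minimal sketch of the envelope idea, assuming the token is a per-action budget with an expiry that the runtime decrements on each call. `CapabilityToken`, its fields, and the resource names are hypothetical; a real token would also be signed and bound to one agent and task.

```python
import time
from dataclasses import dataclass


@dataclass
class CapabilityToken:
    """An approved envelope: how many of each action, and until when."""
    budget: dict       # maps (resource, action) to remaining uses
    expires_at: float  # Unix timestamp after which nothing is allowed

    def use(self, resource, action, now=None):
        """Spend one unit of budget; refuse if expired or exhausted."""
        now = time.time() if now is None else now
        if now >= self.expires_at:
            return False
        key = (resource, action)
        if self.budget.get(key, 0) <= 0:
            return False
        self.budget[key] -= 1
        return True


# The agent proposes "10 reads here, 1 write there"; the human grants it once.
token = CapabilityToken(
    budget={("crm", "read"): 10, ("report", "write"): 1},
    expires_at=1000.0,
)

print(token.use("report", "write", now=0.0))  # True: first write is in budget
print(token.use("report", "write", now=0.0))  # False: write budget is spent
print(token.use("crm", "read", now=2000.0))   # False: token has expired
```

The human approves one plan-sized envelope instead of twelve separate tool calls, which is the trade-off the starter question below asks the room to examine.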

Starter question

What's the right unit of human approval — a tool call, a task, or a plan?

Open thread: nobody actually reads the logs. What would make them want to?

_Room 03

The New Economy of Code

If software generates software, generic SaaS loses its moat. What replaces it — and what does it take to trust a codebase nobody fully read?

_"SaaS is dead"

Generic apps losing value
Custom apps per company, per small team — disposable software
One-person companies; consulting as selling expertise and judgment

_Dark factories

Software security when no human wrote the diff
Realistic test harnesses, including hold-out scenarios
Review, accountability, understanding, maintenance over time

Modular zero-trust architectures, granular blast radius. High-quality, up-to-date, multi-level documentation and changelog. Parallels with corporations and hierarchy.

Starter question

What set of criteria is sufficient for a dev and an owner to feel safe with code they didn't write?

Open thread: when SaaS dies, what's the smallest unit of software a company still buys?

_Format

How the rooms work.

Pick one

Self-sort by the question that nags at you most. Roughly 8–10 people per room.

Open with the question

Each room starts with the starter question on the slide. Disagree out loud. Better questions welcome.

Bring one thing back

At the end, one person from each room shares the sharpest disagreement or the best new question.

30 minutes per round. We reconvene to compare notes.

_Over to you

Three rooms. One question each.

_01 · ARCHITECTURE OF TRUST

Where do we put the parts that can't be wrong?

_02 · HUMAN CONTROL

What's the right unit of human approval — call, task, or plan?

_03 · NEW ECONOMY OF CODE

What makes a dev and an owner feel safe with code they didn't write?

_INVERSED

aurel@inversed.ai · inversed.ai
