Threat model
Adversary classes, trust boundaries, assumed-broken components, and explicit non-goals for the Project-AI governance kernel.
Posture
Project-AI assumes the model is untrusted, the operator may be coerced, and the network is hostile. The governance kernel is the only component permitted to emit ALLOW. Every other component is treated as a potential attacker, and its requests are honored only when backed by a signed verdict.
Adversary model
We model adversaries by capability, not by intent. Each class below describes what the adversary can do, not who they are.
A1 — Compromised model
IN SCOPE: The model can produce arbitrary output, including outputs crafted to evade post-hoc filters. The kernel must adjudicate the action, not the text.
A2 — Compromised client / prompt injector
IN SCOPE: Untrusted input can reach the model via any channel. Capability tokens are bound to the request envelope, not to the prompt.
A3 — Compromised operator workstation
IN SCOPE: An operator endpoint may be malware-controlled. AC4/AC5 actions require multi-party signed approval; a single operator key cannot escalate.
A4 — Network-level adversary
IN SCOPE: TLS termination, MITM, and replay are assumed possible. All decisions carry signed envelopes and TSA timestamps; replay is bounded by a per-envelope nonce and a freshness window.
A5 — Insider with kernel signing key
IN SCOPE: A single insider holding a kernel signing key can forge ALLOW verdicts. This is mitigated by key splitting, rotation, and the public key transparency log at /keys.
A6 — Nation-state with hardware supply-chain access
OUT OF SCOPE: The current kernel does not defend against pre-compromised silicon or firmware. This is documented as a known gap.
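The replay bound described under A4 (nonce plus freshness) can be sketched as follows. This is a minimal illustration under stated assumptions, not the kernel's implementation: the class name `ReplayGuard`, the 30-second window, and the in-memory nonce store are all hypothetical, and signature and TSA timestamp verification are assumed to have already succeeded before this check runs.

```python
import time
from typing import Dict, Optional


class ReplayGuard:
    """Hypothetical replay filter: rejects stale or previously seen envelopes."""

    def __init__(self, max_age_seconds: float = 30.0) -> None:
        # Assumed freshness window; the source does not state a value.
        self.max_age_seconds = max_age_seconds
        self._seen: Dict[str, float] = {}  # nonce -> issued_at of accepted envelope

    def accept(self, nonce: str, issued_at: float, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget nonces whose envelopes are already too old to pass the
        # freshness check; a replay of those is rejected as stale anyway.
        self._seen = {
            n: t for n, t in self._seen.items() if now - t <= self.max_age_seconds
        }
        age = now - issued_at
        # Reject envelopes from the future (no clock-skew allowance in this
        # sketch) and envelopes older than the freshness window.
        if age < 0 or age > self.max_age_seconds:
            return False
        # Reject any nonce already accepted inside the window.
        if nonce in self._seen:
            return False
        self._seen[nonce] = issued_at
        return True
```

Because stale envelopes are rejected on age alone, the nonce store only has to remember nonces for the length of the freshness window, which keeps its size bounded.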
Trust boundaries
Each boundary below is enforced by a signed envelope. Crossing a boundary without a valid signature is treated as DENY.
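As a rough illustration of this rule, the sketch below fails closed on any envelope that does not verify. It assumes HMAC-SHA256 as a stand-in for the signature scheme, which the text above does not specify; the function names, the envelope layout, and the `FORWARD` label are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(payload: dict) -> bytes:
    # Deterministic serialization so signer and verifier hash the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign_envelope(payload: dict, key: bytes) -> dict:
    """Wrap a payload with an HMAC tag (stand-in for a real signature)."""
    tag = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": tag}


def cross_boundary(envelope: dict, key: bytes) -> str:
    """Return "FORWARD" only for a valid signature; anything else is "DENY"."""
    try:
        expected = hmac.new(
            key, _canonical(envelope["payload"]), hashlib.sha256
        ).hexdigest()
        # Constant-time comparison of the expected and presented tags.
        if hmac.compare_digest(expected, envelope["sig"]):
            return "FORWARD"
    except (KeyError, TypeError, ValueError):
        # Missing fields or malformed values are treated the same as a bad
        # signature: the crossing is denied.
        pass
    return "DENY"
```

Note that a valid signature only lets the request proceed to the next component; it is not itself an ALLOW, which only the kernel may emit.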
Client → Portal
TLS + request signature. Portal does not hold capability authority; it forwards intent to the kernel.
Portal → Kernel
mTLS + envelope signature. The portal cannot synthesize a verdict; it can only request one.
Kernel → Runtime
Capability token issued per action, bound to action hash, expiry, and authority class. Runtime rejects any action whose hash does not match the token.
Runtime → External effects
Each external effect (network, storage, payment, code execution) is its own AC-scoped capability. There is no ambient authority.
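The Kernel → Runtime check described above (token bound to action hash, expiry, and authority class) might look like the following. This is a sketch under assumptions: the `CapabilityToken` fields, the JSON canonicalization, the integer encoding of authority classes, and the exact-match rule for the class are all hypothetical, and verification of the token's own signature is omitted.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityToken:
    action_hash: str      # SHA-256 hex digest of the canonicalized action
    expires_at: float     # Unix timestamp after which the token is invalid
    authority_class: int  # e.g. 4 for AC4 (assumed integer encoding)


def hash_action(action: dict) -> str:
    # Canonical JSON so the same action always yields the same digest.
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


def runtime_permits(
    token: CapabilityToken, action: dict, required_class: int, now: float
) -> bool:
    """Permit the action only if every binding on the token holds."""
    if now >= token.expires_at:
        return False  # expired token
    if token.authority_class != required_class:
        return False  # token was issued for a different authority class
    # The token is valid for exactly one action: the one whose hash it carries.
    return token.action_hash == hash_action(action)
```

Binding the token to the action hash means a token issued for one action cannot be reused for a different one, even within its validity period.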
Assumed broken
Components we treat as compromised by default. The kernel must remain safe even when all of these are hostile.
Model weights
ASSUMED BROKEN: Treated as an adversarial output generator.
Prompt channel
ASSUMED BROKEN: Treated as attacker-controlled.
Frontend JavaScript
ASSUMED BROKEN: Treated as observable and tamperable.
Operator email / chat
ASSUMED BROKEN: Treated as phishable. Out-of-band signing required for AC4+.
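The multi-party requirement for AC4+ actions can be illustrated with a simple threshold check. This is a sketch only: the function name, the default threshold of two signers, and the integer encoding of authority classes are assumptions, and each signer ID is assumed to come from a signature that has already been verified out of band.

```python
from typing import Iterable, Set

# AC4 and above need multi-party approval, per the text; the integer
# encoding of authority classes is an assumption of this sketch.
MULTI_PARTY_MIN_CLASS = 4


def multi_party_requirement_met(
    authority_class: int,
    verified_signers: Iterable[str],
    trusted_operators: Set[str],
    threshold: int = 2,  # assumed quorum size
) -> bool:
    """Check the multi-party precondition; this is not itself an ALLOW."""
    if authority_class < MULTI_PARTY_MIN_CLASS:
        # No multi-party requirement below AC4; the kernel still decides.
        return True
    # Count distinct trusted signers so one key signing twice cannot escalate.
    distinct = set(verified_signers) & trusted_operators
    return len(distinct) >= threshold
```

Deduplicating signers is the part that matters for the compromised-workstation case: a single stolen operator key can produce many signatures but still counts once.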
Invariants
Properties that must hold under every adversary in scope. Violation of any invariant triggers SAFE_HALT.
I1 — No silent ALLOW
Every ALLOW produces a signed, externally verifiable receipt.
I2 — No ambient authority
Every capability is explicit, scoped, expiring, and bound to an action hash.
I3 — Deny-by-default
Absence of an ALLOW is DENY. There is no permissive fallback.
I4 — Continuity gate
STATE_REGISTER mismatch produces SAFE_HALT, not silent recovery.
I5 — Public verifiability
Every signing key used for live verdicts is published at /keys with a rotation log.
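Invariants I3 and I4 combine into a small decision rule, sketched below. The `Verdict` enum, the `resolve` function, and the treatment of STATE_REGISTER as an opaque comparable value are assumptions made for illustration; the source does not describe how the register is represented.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    SAFE_HALT = "SAFE_HALT"


def resolve(
    kernel_verdict: Optional[str],
    expected_state_register: str,
    observed_state_register: str,
) -> Verdict:
    """Map a raw kernel verdict and continuity state to a final outcome."""
    # I4: a STATE_REGISTER mismatch halts; there is no silent recovery path.
    if expected_state_register != observed_state_register:
        return Verdict.SAFE_HALT
    # I3: only an explicit, exact ALLOW permits the action.
    if kernel_verdict == "ALLOW":
        return Verdict.ALLOW
    # Missing, malformed, or any other verdict falls through to DENY.
    return Verdict.DENY
```

The continuity check runs first so that an ALLOW issued against a mismatched state can never take effect.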
Explicit non-goals
Things this threat model deliberately does not address. Listed so reviewers do not assume coverage we have not claimed.
Hardware supply chain
OUT OF SCOPE: Pre-compromised CPUs, TPMs, or HSMs.
Side-channel attacks on the kernel host
OUT OF SCOPE: Spectre-class and power-analysis attacks.
Model alignment
OUT OF SCOPE: Project-AI does not claim the model is aligned. It claims the model cannot execute unsafe actions.
Legal admissibility in every jurisdiction
OUT OF SCOPE: Admissibility is claimed against the published frame, not against arbitrary courts.
Disclosure
Vulnerabilities affecting any IN SCOPE adversary class are eligible for coordinated disclosure. Use /disclosure for the coordinated vulnerability policy and /.well-known/security.txt for machine-readable RFC 9116 contact metadata.