Argus · Pipeline Overview · v1 · 2026-04-23

The hundred-eyed threat model.

A framework-agnostic, agentic threat-modeling pipeline — built on Cerbero, the Claude Agent SDK, and Trail of Bits' Claude Code skills.
TL;DR — Many specialized agents examine a target from different angles, a small number of advisor agents critique the high-leverage phases, and the pipeline produces a prioritized threats.json plus a narrative report. Trail of Bits' skills do the heavy lifting; the Agent SDK lets agents call them directly.
08 phases
02 fan-outs
04 advisor seats
01 threats.json

Inputs

Manifest
EngagementSpec
YAML pointing at one or more repositories, with optional base refs and secret policies.
Source
Target repositories
Cloned locally; scoped, not sampled. Every finding must cite path:line-line.
Context
Design docs (PDF)
Optional architecture docs, runbooks, deployment facts — text-extracted for actor/boundary mining.
Operator
Engagement framing
Scope, known boundaries, and any operator context that shapes what's worth attacking.
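A minimal sketch of how the manifest might look once the YAML has been loaded into a mapping. The overview does not show the EngagementSpec schema, so every field name here (`repositories`, `url`, `base_ref`, `design_docs`, `secret_policy`) and the default policy value are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepoSpec:
    url: str                         # hypothetical field: where to clone from
    base_ref: Optional[str] = None   # optional base ref; enables the diff-review track


@dataclass
class EngagementSpec:
    repositories: list[RepoSpec]
    design_docs: list[str] = field(default_factory=list)  # optional PDF paths
    secret_policy: str = "redact"    # hypothetical policy name

    def validate(self) -> None:
        # The manifest must point at one or more repositories.
        if not self.repositories:
            raise ValueError("an engagement needs at least one repository")


def parse_spec(raw: dict) -> EngagementSpec:
    """Build a spec from an already-loaded YAML mapping."""
    spec = EngagementSpec(
        repositories=[RepoSpec(**r) for r in raw.get("repositories", [])],
        design_docs=list(raw.get("design_docs", [])),
        secret_policy=raw.get("secret_policy", "redact"),
    )
    spec.validate()
    return spec
```

Keeping `base_ref` optional on each repository mirrors the pipeline's behavior of skipping the diff-review track when no base ref is declared.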

The pipeline

01
Ingest
ingester · haiku-4-5
Parse the manifest, resolve paths and refs, clone repos, extract PDF text, run a secret scan. Mechanical work — the cheapest model on the roster.
executor secret-scan
out: manifest.resolved.json
02
Audit context
audit-context-builder · sonnet-4-6
Build deep per-repository context before vulnerability hunting — architecture, module map, trust surfaces. Grounded in files that actually exist in the target.
executor audit-context-building read-only
out: phases/02-audit-context/context.md
Concurrent fan-out · joins before phase 04
03 · Fan-out
three parallel tracks, one join
Git themes
diff-reviewer · sonnet
Runs when a base ref is declared. Blast radius + regression checks on diffs. Skipped cleanly when no base ref.
differential-review
Input discovery
input-discoverer · sonnet
First-pass source discovery via semgrep, then code-grounded reasoning to prune false positives.
semgrep + advisor
input-discovery-advisor · sonnet
Critiques the source list before it becomes input to phase 04. One bounded rerun allowed.
Arch-doc ingest
arch-ingester · sonnet
Extract stated assets, actors, trust boundaries, and deployment facts from design docs.
executor docs
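The three tracks and their single join can be sketched with `asyncio`. This is an illustration under stated assumptions, not the pipeline's actual orchestration code: the three coroutine bodies are placeholders standing in for agent runs, and the function names are hypothetical.

```python
import asyncio
from typing import Optional


async def diff_review(base_ref: str) -> dict:
    # Placeholder for the diff-reviewer agent.
    return {"track": "git-themes", "base_ref": base_ref}


async def input_discovery() -> dict:
    # Placeholder for input-discoverer plus its advisor pass.
    return {"track": "input-discovery", "sources": []}


async def arch_ingest() -> dict:
    # Placeholder for arch-ingester.
    return {"track": "arch-doc-ingest"}


async def run_phase_03(base_ref: Optional[str]) -> list[dict]:
    tracks = [input_discovery(), arch_ingest()]
    if base_ref is not None:
        # Git themes runs only when a base ref is declared.
        tracks.append(diff_review(base_ref))
    # gather() is the single join: it returns only after every track finishes.
    return await asyncio.gather(*tracks)
```

Calling `asyncio.run(run_phase_03(None))` runs two tracks; passing a base ref such as `"main"` adds the third.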
04
Taint trace
taint-tracer · sonnet · fan-out per source
Each discovered source gets traced to its terminal sink. Scanners decide applicability; manual, code-grounded tracing fills the gaps. No scanner evidence is ever fabricated.
fan-out ×N sources semgrep codeql sarif-parsing
taint-trace-advisor · opus-4-7
Catches missed hops, ungrounded claims, duplicate paths, impossible flows, unreachable terminal sinks.
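The per-source fan-out might look like the sketch below: one tracer task per discovered source, with a cap on how many run at once. The cap (`max_concurrent`) is an assumption added for illustration; the overview does not say whether the pipeline limits concurrency. `trace_source` is a hypothetical placeholder for a real taint-tracer run.

```python
import asyncio


async def trace_source(source: str) -> dict:
    # Placeholder for one taint-tracer run; a real run would invoke an agent.
    return {"source": source, "hops": [source], "sink": None}


async def fan_out_traces(sources: list[str], max_concurrent: int = 8) -> list[dict]:
    gate = asyncio.Semaphore(max_concurrent)  # assumed limit, not from the overview

    async def bounded(source: str) -> dict:
        async with gate:
            return await trace_source(source)

    # One tracer per discovered source; gather() keeps results in input order.
    return await asyncio.gather(*(bounded(s) for s in sources))
```

Because results come back in input order, each trace can be matched to its source without extra bookkeeping before the advisor reviews the set.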
05
Asset · Actor · Boundary
boundary-modeler · sonnet
Unify evidence from phases 02–03 into a single components / assets / actors / boundaries model. This is the map the threat-enumerator fans out over.
executor synthesis
out: phases/05-asset-actor-boundary/model.json
06
Threat enumeration
threat-enumerator · sonnet · fan-out per component
One enumerator per component, producing concrete source-to-sink threats. No STRIDE, no taxonomy — judgment over categories. Every candidate cites path:line-line or the schema rejects it.
fan-out ×N components schema-validated
threat-enumeration-advisor · opus-4-7
Reviews candidate threats component-by-component. Bounded reruns only where precise correction is needed.
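The "cites path:line-line or the schema rejects it" rule can be approximated with a small check. This is a sketch, not the pipeline's schema: the exact pattern, the `citation` field name, and the example path are assumptions.

```python
import re

# path:line-line, for example "src/auth/login.js:42-57" (hypothetical path)
CITATION = re.compile(r"(?P<path>[^\s:]+):(?P<start>\d+)-(?P<end>\d+)")


def is_valid_citation(text: str) -> bool:
    m = CITATION.fullmatch(text)
    if m is None:
        return False
    start, end = int(m["start"]), int(m["end"])
    # Line numbers are 1-based and the range must not run backwards.
    return 1 <= start <= end


def reject_uncited(candidates: list[dict]) -> list[dict]:
    """Keep only candidates whose `citation` field is well-formed."""
    return [c for c in candidates if is_valid_citation(c.get("citation", ""))]
```

A check like this only proves the citation is well-formed; confirming that the cited lines exist in the target repository would need a separate lookup.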
07
Prioritize
prioritizer · opus-4-7 · advisor-shaped
A single advisor-shaped phase. Reads the full threat list and assigns priority + priority_rationale in one pass. Reachability is a dominant input signal.
advisor-shaped reachability-weighted
out: threats.json · threats/<id>.json
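A sketch of how the two outputs of this phase could be written: one combined threats.json plus one file per threat. The `priority` and `priority_rationale` fields come from the description above; the `id` key, the lowercase priority labels, and the sort order are assumptions, and the priority itself is assigned by the model, not by this code.

```python
import json
from pathlib import Path

# Assumed labels, taken from the severity buckets in the run summary.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def write_threats(threats: list[dict], out_dir: Path) -> None:
    # Highest priority first; sorted() is stable, so ties keep their input order.
    ordered = sorted(threats, key=lambda t: PRIORITY_ORDER[t["priority"]])
    per_threat_dir = out_dir / "threats"
    per_threat_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "threats.json").write_text(json.dumps(ordered, indent=2))
    for threat in ordered:
        # One file per threat, named after its id.
        (per_threat_dir / f"{threat['id']}.json").write_text(json.dumps(threat, indent=2))
```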
08
Report
reporter · sonnet
Narrative report.md wrapping the prioritized threats — asset table, actor table, trust boundaries, key risk themes. The artifact humans read; threats.json is the artifact tools read.
executor narrative
out: report.md

The executor / advisor pattern

One shot to disagree. One shot to revise.

Most phases run a single executor. Advisors attach only where judgment errors are expensive and easy to miss. The pipeline allows at most one executor rerun per advised phase — so a disagreeing advisor cannot loop the phase indefinitely.

  1. Executor runs with tools, produces an artifact.
  2. Advisor reads the artifact plus curated phase context. It has no tools.
  3. Advisor writes critique.md and structured revision_requests.json — every time, even when the verdict is "looks fine."
  4. If and only if requests exist, the executor reruns once with previousArtifact + revisionRequest folded into context.
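The four steps above can be sketched as a single function with no loop. This is an illustration under assumptions: the executor and advisor are modeled as plain callables (real ones would be agent runs), and the sketch assumes the advisor does not review the revised artifact a second time, which the overview does not state either way.

```python
from typing import Callable, Optional

Executor = Callable[[Optional[dict]], str]         # revision context -> artifact
Advisor = Callable[[str], tuple[str, list[str]]]   # artifact -> (critique, requests)


def run_advised_phase(executor: Executor, advisor: Advisor) -> dict:
    artifact = executor(None)                  # step 1: first executor run
    critique, requests = advisor(artifact)     # steps 2-3: critique is always produced
    reran = False
    if requests:                               # step 4: rerun only if requests exist
        artifact = executor(
            {"previousArtifact": artifact, "revisionRequest": requests}
        )
        reran = True                           # straight-line code: at most one rerun
    return {
        "artifact": artifact,
        "critique": critique,
        "requests": requests,
        "reran": reran,
    }
```

Because the rerun sits in an `if` and not a `while`, the bound of one rerun is structural: a persistently disagreeing advisor has no path back into the executor.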

Advisors are paid to disagree, so the high-leverage seats get the bigger model: opus-4-7 for taint-trace, threat-enumeration, and prioritize. The input-discovery advisor runs on sonnet, as do the executors (sonnet-4-6); the mechanical ingester runs on haiku-4-5.

Run · LegalCaseFlow (deliberately vulnerable)

Critical: 12 threats
High: 27 threats
Medium: 24 threats
Low: 1 threat
Total threats: 64
Fan-out · sources: 59
Fan-out · components: 19
Tokens · in / out: 137K / 680K
Unauthenticated JWT signing-key disclosure
GET /startup-metadata returns auth_secret_key to any internet caller. Three distinct threats reached the sink from different angles.
internet → express → system_settings
SQL injection at login
POST /auth/login interpolates email directly into a WHERE clause. Unauthenticated bypass + full cross-tenant read.
internet → express → postgres (no RLS)
Password reset token leaked in response body
Two unauthenticated requests suffice to take over any account with a known email — including admins.
anon → /auth/reset → response body
Host-exposed PostgreSQL
Docker Compose maps 5432 to the host with hardcoded legalcf/legalcf credentials. Bypasses all application-layer controls.
host:5432 → postgres (direct)
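To make the SQL injection finding concrete, the sketch below contrasts string interpolation with a parameterized query. It is a swapped-in illustration, not code from LegalCaseFlow: the target uses Express and PostgreSQL, while this example uses Python's built-in `sqlite3` with an in-memory database, and the table, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, tenant TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("a@example.com", "t1"), ("b@example.com", "t2")],
)


def find_user_unsafe(email: str) -> list[tuple]:
    # Vulnerable: the input becomes part of the SQL text.
    query = f"SELECT email, tenant FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(email: str) -> list[tuple]:
    # Parameterized: the driver passes the input as data, never as SQL.
    query = "SELECT email, tenant FROM users WHERE email = ?"
    return conn.execute(query, (email,)).fetchall()


payload = "' OR '1'='1"
```

With this payload, `find_user_unsafe(payload)` returns every row across both hypothetical tenants, while `find_user_safe(payload)` returns an empty list because no user has that literal email.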

Where Argus sits in the loop

Offensive-to-defensive loop, wired as agentic stages — not a twice-a-year calendar item.
Stage 01 · You are here
Argus
Threat-model the target. Produce prioritized backlog.
Stage 02
Odysseus
Run the pentest against the prioritized backlog.
Stage 03
Human builder
Fix the issues the exploits surfaced. Still a human today.
Stage 04
Cassian
Differential review of the remediation PR — fix closed, no regressions.
Argus · built on Cerbero · Claude Agent SDK · Trail of Bits skills
Educational / authorized testing only. Only use these techniques against systems you own or have explicit permission to assess. For production or team builds, switch to the Claude API directly.