Argus · Pipeline Overview · v1 · 2026-04-23

The hundred-eyed threat model.

A framework-agnostic, agentic threat-modeling pipeline — built on Cerbero, the Claude Agent SDK, and Trail of Bits' Claude Code skills.

TL;DR — Many specialized agents examine a target from different angles, a small number of advisor agents critique the high-leverage phases, and the pipeline produces a prioritized threats.json plus a narrative report. Trail of Bits' skills do the heavy lifting; the Agent SDK lets agents call them directly.

08 phases

02 fan-outs

04 advisor seats

01 threats.json

Inputs

Manifest

EngagementSpec

YAML pointing at one or more repositories, with optional base refs and secret policies.

Source

Target repositories

Cloned locally; scoped, not sampled. Every finding must cite path:line-line.

Context

Design docs (PDF)

Optional architecture docs, runbooks, deployment facts — text-extracted for actor/boundary mining.

Operator

Engagement framing

Scope, known boundaries, and any operator context that shapes what's worth attacking.
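As a rough illustration of what the EngagementSpec manifest might carry, the sketch below models the four inputs as Python dataclasses. Every field name here (`url`, `base_ref`, `secret_policy`, `design_docs`, `framing`) is hypothetical; this overview only says the spec points at one or more repositories, with optional base refs and secret policies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepoSpec:
    # Hypothetical field names; the real EngagementSpec schema is not shown here.
    url: str
    base_ref: Optional[str] = None       # optional: enables the diff-review track
    secret_policy: Optional[str] = None  # optional: how secret-scan hits are handled


@dataclass
class EngagementSpec:
    repositories: list[RepoSpec]
    design_docs: list[str] = field(default_factory=list)  # optional PDF paths
    framing: str = ""                                      # operator-supplied scope notes

    def validate(self) -> None:
        # "One or more repositories": an empty list is treated as an error.
        if not self.repositories:
            raise ValueError("EngagementSpec needs at least one repository")


def from_mapping(data: dict) -> EngagementSpec:
    """Build a spec from an already-parsed YAML mapping (YAML parsing is out of scope)."""
    repos = [RepoSpec(**r) for r in data.get("repositories", [])]
    spec = EngagementSpec(
        repositories=repos,
        design_docs=list(data.get("design_docs", [])),
        framing=data.get("framing", ""),
    )
    spec.validate()
    return spec
```

Keeping `base_ref` optional per repository mirrors the pipeline's behavior of skipping diff review when no base ref is declared.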

The pipeline

Ingest

ingester · haiku-4-5

Parse the manifest, resolve paths and refs, clone repos, extract PDF text, run a secret scan. Mechanical work — the cheapest model on the roster.

executor · secret-scan

out · manifest.resolved.json

Audit context

audit-context-builder · sonnet-4-6

Build deep per-repository context before vulnerability hunting — architecture, module map, trust surfaces. Grounded in files that actually exist in the target.

executor · audit-context-building · read-only

out · phases/02-audit-context/context.md

03 · Fan-out

Three parallel tracks, one join. The tracks run concurrently and join before phase 04.

Git themes

diff-reviewer · sonnet

Runs when a base ref is declared. Blast radius + regression checks on diffs. Skipped cleanly when no base ref.

differential-review

Input discovery

input-discoverer · sonnet

First-pass source discovery via semgrep, then code-grounded reasoning to prune false positives.

semgrep + advisor

input-discovery-advisor · sonnet

Critiques the source list before it becomes input to phase 04. One bounded rerun allowed.

Arch-doc ingest

arch-ingester · sonnet

Extract stated assets, actors, trust boundaries, and deployment facts from design docs.

executor · docs
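The "three parallel tracks, one join" shape of this phase can be sketched with `asyncio.gather`. This is only an illustration under assumptions: the three coroutines are stand-ins that return canned data, where the real tracks would invoke agents, and the function names and result keys are invented for the example.

```python
import asyncio
from typing import Optional


# Stand-ins for the three tracks; real tracks would run agents, not return canned data.
async def git_themes(base_ref: Optional[str]) -> Optional[dict]:
    if base_ref is None:
        return None  # skipped cleanly when no base ref is declared
    return {"track": "git-themes", "base_ref": base_ref}


async def input_discovery() -> dict:
    return {"track": "input-discovery", "sources": ["POST /auth/login"]}


async def arch_doc_ingest() -> dict:
    return {"track": "arch-doc-ingest", "actors": []}


async def run_fan_out(base_ref: Optional[str]) -> dict:
    # Start all three tracks concurrently and wait for every one: the single join.
    diff, sources, arch = await asyncio.gather(
        git_themes(base_ref), input_discovery(), arch_doc_ingest()
    )
    return {"git_themes": diff, "input_discovery": sources, "arch_doc_ingest": arch}
```

Calling `asyncio.run(run_fan_out(None))` yields a result whose `git_themes` entry is `None`, which is one simple way to represent a skipped track without failing the join.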

Taint trace

taint-tracer · sonnet · fan-out per source

Each discovered source gets traced to its terminal sink. Scanners decide applicability; manual, code-grounded tracing fills the gaps. No scanner evidence is ever fabricated.

fan-out ×N sources · semgrep · codeql · sarif-parsing

taint-trace-advisor · opus-4-7

Catches missed hops, ungrounded claims, duplicate paths, impossible flows, unreachable terminal sinks.
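One way scanner output could feed the path:line-line evidence rule is by reading locations straight out of a SARIF log. The sketch below assumes a SARIF 2.1.0 document and uses only fields defined by that format (`runs`, `results`, `locations`, `physicalLocation`, `artifactLocation.uri`, `region.startLine`, `region.endLine`). The function name is hypothetical, and nothing here is taken from the Argus codebase.

```python
import json


def sarif_citations(sarif_text: str) -> list[str]:
    """Return `path:start-end` strings for every located result in a SARIF 2.1.0 log."""
    log = json.loads(sarif_text)
    citations = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            for loc in result.get("locations", []):
                phys = loc.get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri")
                region = phys.get("region", {})
                start = region.get("startLine")
                if uri is None or start is None:
                    continue  # no usable location: skip rather than invent one
                end = region.get("endLine", start)  # endLine defaults to startLine
                citations.append(f"{uri}:{start}-{end}")
    return citations
```

Results without a file and start line are dropped instead of being given a guessed location, in keeping with the rule that scanner evidence is never fabricated.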

Asset · Actor · Boundary

boundary-modeler · sonnet

Unify evidence from phases 02–03 into a single components / assets / actors / boundaries model. This is the map the threat-enumerator fans out over.

executor · synthesis

out · phases/05-asset-actor-boundary/model.json

Threat enumeration

threat-enumerator · sonnet · fan-out per component

One enumerator per component, producing concrete source-to-sink threats. No STRIDE, no taxonomy — judgment over categories. Every candidate cites path:line-line or the schema rejects it.

fan-out ×N components · schema-validated

threat-enumeration-advisor · opus-4-7

Reviews candidate threats component-by-component. Bounded reruns only where precise correction is needed.
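The citation rule for threat candidates can be sketched as a small validator. This is an assumption-laden illustration: the overview only says the schema rejects candidates that lack a path:line-line citation, so the extra checks that the file exists and that the range fits inside it are an assumed extension, and `check_citation` is a hypothetical name.

```python
import re
from pathlib import Path

# path:start-end, e.g. apps/api/src/services/auth.service.ts:165-190
CITATION = re.compile(r"^(?P<path>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def check_citation(citation: str, repo_root: Path) -> None:
    """Raise ValueError unless the citation names real lines of a real file under repo_root."""
    m = CITATION.match(citation)
    if m is None:
        raise ValueError(f"not in path:line-line form: {citation!r}")
    start, end = int(m["start"]), int(m["end"])
    if start < 1 or end < start:
        raise ValueError(f"bad line range: {citation!r}")
    target = repo_root / m["path"]
    if not target.is_file():
        raise ValueError(f"no such file in target: {m['path']}")
    # Assumed extra check: the cited range must fit inside the file.
    line_count = len(target.read_text(errors="replace").splitlines())
    if end > line_count:
        raise ValueError(f"range ends past end of file ({line_count} lines): {citation!r}")
```

Checking against the cloned repository, not just the string shape, is what would make a citation hard to hallucinate.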

Prioritize

prioritizer · opus-4-7 · advisor-shaped

A single advisor-shaped phase. Reads the full threat list and assigns priority + priority_rationale in one pass. Reachability is a dominant input signal.

advisor-shaped · reachability-weighted

out · threats.json · threats/<id>.json
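The two output locations named above could be produced along these lines. This is a sketch under assumptions: it treats threats.json as a JSON array sorted by priority and assumes each record carries `id` and `priority` keys. The overview does not specify the file layout, so none of that should be read as the actual format.

```python
import json
from pathlib import Path

# Assumed ordering of the four priority levels named in the run summary.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def write_threats(threats: list[dict], out_dir: Path) -> None:
    """Write the aggregate threats.json plus one threats/<id>.json per record."""
    # sorted() is stable, so records with equal priority keep their input order.
    ordered = sorted(threats, key=lambda t: PRIORITY_ORDER[t["priority"]])
    (out_dir / "threats").mkdir(parents=True, exist_ok=True)
    (out_dir / "threats.json").write_text(json.dumps(ordered, indent=2))
    for threat in ordered:
        path = out_dir / "threats" / f"{threat['id']}.json"
        path.write_text(json.dumps(threat, indent=2))
```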

Report

reporter · sonnet

Narrative report.md wrapping the prioritized threats — asset table, actor table, trust boundaries, key risk themes. The artifact humans read; threats.json is the artifact tools read.

executor · narrative

out · report.md

The executor / advisor pattern

One shot to disagree. One shot to revise.

Most phases run a single executor. Advisors attach only where judgment errors are expensive and easy to miss. The pipeline allows at most one executor rerun per advised phase — so a disagreeing advisor cannot loop the phase indefinitely.

01 Executor runs with tools, produces an artifact.
02 Advisor reads the artifact plus curated phase context. It has no tools.
03 Advisor writes critique.md and structured revision_requests.json every time, even when the verdict is "looks fine."
04 If and only if requests exist, the executor reruns once with previousArtifact + revisionRequest folded into context.
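The four steps above amount to a loop that can run at most twice. A minimal sketch follows, assuming the executor and advisor are plain callables and that the advisor does not review the rerun; the overview does not say either way. The type aliases and the `run_advised_phase` name are hypothetical, while `previousArtifact` and `revisionRequest` are the context keys named in step 04.

```python
from typing import Callable, Optional

Executor = Callable[[Optional[dict]], str]   # optional revision context in, artifact out
Advisor = Callable[[str], tuple[str, list]]  # artifact in, (critique, revision requests) out


def run_advised_phase(executor: Executor, advisor: Advisor) -> dict:
    artifact = executor(None)               # 01: first executor run
    critique, requests = advisor(artifact)  # 02-03: a critique is always produced
    reran = False
    if requests:                            # 04: rerun only when requests exist
        artifact = executor({"previousArtifact": artifact, "revisionRequest": requests})
        reran = True                        # no second advisor pass, so no loop
    return {
        "artifact": artifact,
        "critique": critique,
        "revision_requests": requests,
        "reran": reran,
    }
```

Because the rerun happens inside a single `if` and not a `while`, the bound on reruns is structural and does not depend on the advisor eventually agreeing.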

Advisors are paid to disagree, so three of the four advisor seats get the bigger model: opus-4-7 for taint-trace, threat-enumeration, and prioritize. The input-discovery advisor runs on sonnet, like the executors (sonnet-4-6); the mechanical ingester runs on haiku-4-5.

Run · LegalCaseFlow (deliberately vulnerable)

[Run summary panel: threat counts by priority (Critical, High, Medium, Low), total threats, fan-out width over sources and over components; values not preserved.]

Tokens · in / out: 137K / 680K

Unauthenticated JWT signing-key disclosure

GET /startup-metadata returns auth_secret_key to any internet caller. Three distinct threats reached the sink from different angles.

internet → express → system_settings

SQL injection at login

POST /auth/login interpolates email directly into a WHERE clause. Unauthenticated bypass + full cross-tenant read.

internet → express → postgres (no RLS)
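The class of bug behind this finding can be reproduced in a few lines. The sketch below is a stand-in and not the target's code: it uses Python's built-in `sqlite3` with an in-memory table in place of the Express and PostgreSQL stack, and the table, rows, and function names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, tenant TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice@a.example", "tenant-a"), ("bob@b.example", "tenant-b")],
)


def lookup_vulnerable(email: str) -> list:
    # String interpolation: the email value becomes part of the SQL text.
    query = f"SELECT email, tenant FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def lookup_parameterized(email: str) -> list:
    # Bound parameter: the value is passed separately from the SQL text.
    query = "SELECT email, tenant FROM users WHERE email = ?"
    return conn.execute(query, (email,)).fetchall()


payload = "' OR '1'='1"
```

With this payload, `lookup_vulnerable(payload)` returns both rows across both tenants, while `lookup_parameterized(payload)` returns an empty list because the whole string is compared as a literal email.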

Password reset token leaked in response body

Two unauthenticated requests suffice to take over any account with a known email — including admins.

anon → /auth/reset → response body

Host-exposed PostgreSQL

Docker Compose maps 5432 to the host with hardcoded legalcf/legalcf credentials. Bypasses all application-layer controls.

host:5432 → postgres (direct)

priority · critical

Login endpoint interpolates email directly into SQL WHERE clause

T-sql-injection-at-login-via-unparameterized-email

component: Express REST API (apps/api)

entry: POST /auth/login

boundary: Internet → Express REST API → PostgreSQL

reach: internet

evidence: apps/api/src/services/auth.service.ts:165-190

rationale: Internet-reachable, unauthenticated, direct path to full credential dump.

confidence: high

one record · exactly as it landed in threats.json
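Since threats.json is the artifact tools read, a downstream consumer might look something like the following. It is a sketch only: it assumes the file is a JSON array and that the labels shown on the record above (`priority`, `reach`) are the literal keys, with `id` as an assumed key for the T-... identifier.

```python
import json


def internet_reachable_criticals(threats_json: str) -> list[str]:
    """Return the ids of critical threats whose reach is the internet."""
    threats = json.loads(threats_json)
    return [
        t["id"]
        for t in threats
        if t["priority"] == "critical" and t["reach"] == "internet"
    ]
```

A filter like this could hand the next stage of the loop its first targets, or gate a build on the list being empty.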

Where Argus sits in the loop

Offensive-to-defensive loop, wired as agentic stages — not a twice-a-year calendar item.

Stage 01 · You are here

Argus

Threat-model the target. Produce prioritized backlog.

→

Stage 02

Odysseus

Run the pentest against the prioritized backlog.

→

Stage 03

Human builder

Fix the issues the exploits surfaced. Still a human today.

→

Stage 04

Cassian

Differential review of the remediation PR — fix closed, no regressions.

Argus · built on Cerbero · Claude Agent SDK · Trail of Bits skills

Educational / authorized testing only. Only use these techniques against systems you own or have explicit permission to assess. For production or team builds, switch to the Claude API directly.