Bookbag
Compliance Autopilot

AI runs your AI compliance program — on the loop.

The Autopilot is your autonomous compliance officer. It scopes use cases, drafts policies, runs evaluations, collects evidence, and exports the audit bundle continuously. The work a manual compliance team does in quarters, the Autopilot does every day.

The shift

You don't hire a compliance team. You install one.

The old model: a manual compliance vendor charges $150K–$500K per year to do the same work every quarter. The new model: Autopilot runs it for you, continuously, with every approved trace.

Manual compliance shop
  • Quarterly engagement, while your runtime moves daily
  • Humans transcribe traces into spreadsheets
  • Policy authoring takes weeks
  • Evidence assembled at audit time, not continuously
  • $150K–$500K/year + per-hour overages

Autopilot
  • Runs on the loop: daily by default, and on every approved annotation
  • Reads traces directly, with no transcription and no spreadsheets
  • Drafts policies from observed runtime gaps in minutes
  • Evidence bundle is continuously fresh, ready any day
  • One platform fee; humans only approve material decisions

The compliance loop

Seven steps, on autopilot, every day

The same loop a senior compliance officer runs by hand — Autopilot runs it for every AI system you ship, on the cadence you set.

1. Discover use cases

Autopilot reads runtime traces from Observe, clusters them into use cases (Customer Support Refunds, Legal Clause Summarizer, etc.), and proposes a risk tier per use case.

2. Pick frameworks

Based on industry, region, and use-case risk tier, Autopilot selects the applicable frameworks (EU AI Act Annex III, NIST AI RMF, ISO 42001, SOC 2, HIPAA) and the shipped controls that go with them.
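
One way to picture this selection step is as a small rule table keyed on industry, region, and risk tier. The sketch below is illustrative only: the `UseCase` fields, the `select_frameworks` function, and the rules themselves are assumptions made for this example. They are not Autopilot's actual logic, and they are not a legal determination of which framework applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    industry: str   # hypothetical values, e.g. "healthcare" or "retail"
    region: str     # hypothetical values, e.g. "EU" or "US"
    risk_tier: str  # "low", "med", or "high"


def select_frameworks(uc: UseCase) -> list[str]:
    """Return framework names for a use case using simple illustrative rules."""
    frameworks = ["NIST AI RMF"]  # assumed baseline for every use case
    if uc.region == "EU" and uc.risk_tier == "high":
        frameworks.append("EU AI Act Annex III")
    if uc.industry == "healthcare":
        frameworks.append("HIPAA")
    return frameworks
```

Under these assumed rules, a high-risk retail use case in the EU would pick up the NIST AI RMF baseline plus EU AI Act Annex III.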

3. Draft policies

Where Guardrails has coverage gaps, Autopilot drafts policies in plain English with example traces — humans approve before they go live.

4. Run evaluations

Autopilot schedules taxonomy-driven Evaluation suites with the staged AI auditor, monitors pass rates, and flags regressions against a baseline.
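
To make "flags regressions against a baseline" concrete, here is a minimal sketch of a pass-rate check. The function names and the 0.02 tolerance are assumptions chosen for illustration; the thresholds Autopilot actually uses are not stated here.

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of evaluation cases that passed."""
    if not results:
        raise ValueError("no evaluation results")
    return sum(results) / len(results)


def is_regression(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Flag a regression when the pass rate falls more than `tolerance` below baseline."""
    return (baseline - current) > tolerance
```

A tolerance keeps small run-to-run noise from raising a flag, while a real drop below the baseline still surfaces.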

5. Surface intelligence

Five intelligence jobs run per approved annotation: drift, taxonomy gaps, prompt sensitivity, factuality regression, policy coverage. Findings feed back into draft policies.

6. Loop in humans on judgment calls

Anything material (policy approval, risk acceptance, framework swap, bundle export) goes through an agent-action gate and waits for a human. Autopilot drafts; humans sign.
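
The gating idea can be sketched as a queue that holds material actions until a named reviewer signs off. Everything in this example (`Action`, `Gate`, the `GATED_KINDS` set, the status strings) is hypothetical and only illustrates the pattern described above.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed action kinds that need a human signature, taken from the list above.
GATED_KINDS = {"policy_approval", "risk_acceptance", "framework_swap", "bundle_export"}


@dataclass
class Action:
    kind: str
    payload: dict
    approved_by: Optional[str] = None


@dataclass
class Gate:
    pending: list = field(default_factory=list)

    def submit(self, action: Action) -> str:
        """Run non-material actions right away; queue material ones for a human."""
        if action.kind in GATED_KINDS and action.approved_by is None:
            self.pending.append(action)
            return "awaiting_approval"
        return "executed"

    def approve(self, action: Action, reviewer: str) -> str:
        """Record the reviewer, take the action off the queue, and release it."""
        action.approved_by = reviewer
        self.pending.remove(action)
        return self.submit(action)
```

Recording `approved_by` on the action itself means the audit log can show who signed each material decision.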

7. Export the audit bundle

Signed manifest with controls, traces, policy snapshots, eval results, audit log — one click. Continuously fresh, ready every day.
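
For readers who want a feel for what a signed manifest involves, the sketch below hashes each artifact and signs the resulting manifest. It is an assumption-heavy illustration: the artifact names are made up, and HMAC-SHA256 with a shared key stands in for whatever signature scheme the real bundle uses.

```python
import hashlib
import hmac
import json


def build_manifest(artifacts: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to the SHA-256 digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in sorted(artifacts.items())}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    """Sign the canonical JSON form of the manifest with HMAC-SHA256."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_manifest(manifest: dict[str, str], key: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_manifest(manifest, key), signature)
```

Because the signature covers every digest, changing any artifact after export makes verification fail.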

Live trace, full history

Watch the Autopilot work, replay any run

Every step streams over SSE. Every run is archived. Open last Tuesday's loop, see what it decided, what it deferred, and why.

autopilot · run_2026-05-27_07-04Z · streaming
DISCOVER   142 traces clustered into 3 use cases · refund-agent (high), support-agent (med), drafting-agent (med)
FRAMEWORK  EU_AI_ACT Annex III(4) applies to refund-agent; selecting 8 controls
DRAFT      policy:refund.amount.exceeds_threshold drafted; awaiting human approval (gated)
EVAL       eval:refunds-q3-baseline ran; 0.972 pass rate (+0.04 from last run)
INSIGHT    prompt_sensitivity flagged on order_id=FF-*; 11% verdict drift across 18 reruns
EXPORT     evidence bundle build queued · 23 controls satisfied · awaiting signature
Next step: human approval on 2 drafted policies (Slack ping sent)
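
A run streamed over SSE can be read with a small parser for the `text/event-stream` line format. The sketch below handles `event:` and `data:` fields, comment lines, and blank-line message boundaries. Using the step name as the event name is an assumption for illustration; the actual event names and payloads are not documented here.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one {"event", "data"} dict per SSE message; a blank line ends a message."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)
```

Feeding it the lines of a response body yields one dict per step, which a client could render live or archive for replay.
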
Per-annotation intelligence

Five intelligence jobs per approved annotation

Every time a human approves an annotation, Autopilot fires five jobs in the background — finding the patterns a human reviewer would miss across thousands of traces.

Drift detection

Compares current verdict distribution against rolling baseline. Flags when models silently degrade.
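
One simple way to compare a current verdict distribution against a baseline is total variation distance. The sketch below assumes that metric and a 0.1 threshold purely for illustration; the metric and threshold Autopilot uses are not specified.

```python
from collections import Counter


def distribution(verdicts: list[str]) -> dict[str, float]:
    """Turn a list of verdict labels into label -> share of total."""
    if not verdicts:
        raise ValueError("no verdicts")
    counts = Counter(verdicts)
    return {label: n / len(verdicts) for label, n in counts.items()}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the sum of absolute differences between two distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


def has_drifted(baseline: list[str], current: list[str], threshold: float = 0.1) -> bool:
    """Flag drift when the distance between the distributions exceeds the threshold."""
    return total_variation(distribution(baseline), distribution(current)) > threshold
```

In practice the baseline would be a rolling window of recent verdicts rather than a fixed list.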

Taxonomy gap discovery

Finds outputs that don't fit any existing taxonomy dimension. Drafts new dimensions for review.

Prompt sensitivity

Reruns the same prompt with controlled perturbations. Flags low-stability verdicts.
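
A toy version of this check reruns a prompt under small, meaning-preserving perturbations and measures how often the verdict disagrees with the majority. The `perturb` variants and the `judge` callable are hypothetical stand-ins, and the prompt is assumed to be non-empty.

```python
from collections import Counter
from typing import Callable


def perturb(prompt: str) -> list[str]:
    """Hypothetical perturbations that should not change the prompt's meaning."""
    return [prompt, prompt.lower(), prompt + " ", "Please " + prompt[0].lower() + prompt[1:]]


def verdict_drift(prompt: str, judge: Callable[[str], str]) -> float:
    """Share of perturbed reruns whose verdict differs from the majority verdict."""
    verdicts = [judge(p) for p in perturb(prompt)]
    majority_count = Counter(verdicts).most_common(1)[0][1]
    return 1 - majority_count / len(verdicts)
```

A drift of 0.0 means every rerun agreed; higher values point to verdicts that depend on phrasing rather than substance.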

Factuality regression

Cross-checks claims against the RAG knowledge base. Flags hallucination patterns by use case.

Policy coverage

Maps every output against Guardrails policies. Finds calls that ran without coverage.
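
The coverage mapping can be pictured as a join between calls and policies. The record shapes below (`id`, `use_case`, `use_cases`) are assumed for illustration and do not reflect the real Guardrails schema.

```python
def coverage_report(calls: list[dict], policies: list[dict]) -> dict[str, list[str]]:
    """Map each call id to the ids of the policies that apply to its use case."""
    return {
        call["id"]: [p["id"] for p in policies if call["use_case"] in p["use_cases"]]
        for call in calls
    }


def uncovered(report: dict[str, list[str]]) -> list[str]:
    """Return the ids of calls that no policy covered."""
    return [call_id for call_id, policy_ids in report.items() if not policy_ids]
```

Calls returned by `uncovered` are the natural input for drafting new policies.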

What Autopilot runs on

Autopilot is the brain. The platform is the body.

You can use Observe, Guardrails, Evaluation, and Governance manually — Autopilot just runs them for you, on a continuous loop, with full transparency.

Compliance frameworks the Autopilot runs against

EU AI Act: Annex III controls
NIST AI RMF: Govern · Map · Measure · Manage
ISO 42001: AI management system
SOC 2: Type II evidence
HIPAA: Health AI risk tier

Stop paying compliance consultants by the quarter. Run the program on autopilot.

Join the teams shipping safer AI with real-time evaluation, audit trails, and continuous improvement.