# 12 — AI-Assisted Algorithm Builder

> **Status:** ✅ SHIPPED (Sprints 1-4 complete as of 2026-04-17). Enhancements shipped 2026-04-20.
> **Successor doc:** [13-broker-integration.md](./13-broker-integration.md) covers the live-broker layer that plugs into this builder.
> **Enhancement doc:** [32-algorithm-enhancements.md](./32-algorithm-enhancements.md) — live metrics, risk controls, templates, AI suggestions, advanced backtesting, leaderboard, graduation.

A composable, LLM-augmented strategy builder where users can compose, backtest, paper-trade, and (eventually) live-trade strategies — with an LLM co-pilot that helps author, critique, and adaptively tune them under hard safety constraints.

---

## Operating Principles

1. **Risk-first, alpha-second.** Hard limits are inviolable, not parameters. The kill switch can never be disabled by the algo or by the LLM.
2. **Every change is auditable.** Self-modification is allowed only inside named "trust scopes." Every diff is logged with the LLM's reasoning, the metric that triggered it, and a one-click revert.
3. **The LLM is a co-pilot, not the captain.** Three modes:
   - **Mode 1**: human-only (LLM analyses, never writes)
   - **Mode 2**: LLM-proposes-human-approves (default for live)
   - **Mode 3**: LLM-autonomous-within-sandbox (opt-in per strategy + per parameter range)
4. **Backtest before paper, paper before live.** No strategy reaches `mode=live` without ≥30 days of paper trading + a passing walk-forward + a passing regime stress test.
5. **Composability beats cleverness.** Strategies are graphs of well-tested primitives, not monolithic functions.
6. **Anti-hallucination is built-in.** Every LLM output is schema-validated, held to cite-or-refuse for data claims, cross-validated against deterministic engines, and adversarially reviewed before it can affect a live system.

---

## Architecture (11 Modules)

```
┌─────────────────────────────────────────────────────────────┐
│ 1. AUTHORING SURFACES                                        │
│    Visual Builder | Natural Language | DSL Editor            │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 2. COMPILER → CANONICAL AST                                  │
│    All three surfaces compile to the same JSON AST           │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 3. LLM PRE-FLIGHT CRITIQUE                                   │
│    Look-ahead bias · survivorship bias · overfitting flags   │
│    Risk narration: "your max DD scenario is..."              │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 4. SIGNAL DSL ENGINE (sandboxed evaluator)                   │
│    Pure expressions, deterministic, no I/O during eval       │
└────────┬────────────────────────────────────────┬───────────┘
         ▼                                        ▼
┌──────────────────────┐                ┌──────────────────────┐
│ 5. BACKTESTER        │                │ 6. LIVE EXECUTOR     │
│   Historical replay  │                │   Real-time data     │
│   Walk-forward       │                │   Broker adapter     │
│   Cost + slippage    │                │   Per-trade preflight│
│   Regime stress      │                │   LLM monitor loop   │
└──────────┬───────────┘                └──────────┬───────────┘
           └────────────────┬───────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 7. GUARDRAIL LAYER (cannot be disabled by algo or LLM)       │
│    Account-level kill switches · Per-strategy circuit-breakers│
│    Manual panic button · Time/news blackouts                  │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 8. SELF-MODIFICATION CONTROLLER                              │
│    Mode 1/2/3 · Diff queue · Auto-revert on degradation      │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 9. TELEMETRY + ATTRIBUTION                                   │
│    Per-signal alpha contribution · Drawdown analysis         │
│    LLM failure narration · User feedback ingestion           │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 10. LEARNING LOOP                                            │
│     Bayesian param search · LLM strategy suggestions         │
│     Continuous critique → next-iter proposals                │
└──────────────────────────┬──────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ 11. ANTI-HALLUCINATION GATEWAY                               │
│     ALL LLM outputs flow through this module before they     │
│     affect anything. See section below.                      │
└─────────────────────────────────────────────────────────────┘
```

---

## The Signal DSL

A pure-expression language with statically-typed primitives. **Sandboxed via a custom evaluator** (NOT `eval` or `Function`). Each expression compiles to a tree the executor walks per tick.
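
As a rough illustration of the tree-walking approach, the sketch below evaluates a tiny AST against a frozen tick. The node shapes, the `PRIMITIVES` map, and the tick keys (`rsi14`, `vix`) are assumptions made for this example; they are not the types in `dsl/types.ts`.

```ts
// Hypothetical AST node shapes, for illustration only.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "bool"; value: boolean }
  | { kind: "call"; name: string; args: Expr[] }
  | { kind: "cmp"; op: "<" | ">" | "=="; left: Expr; right: Expr }
  | { kind: "all"; items: Expr[] }
  | { kind: "any"; items: Expr[] };

type Value = number | boolean;
type Tick = Readonly<Record<string, number>>;
type Primitive = (args: Value[], tick: Tick) => Value;

// Whitelist of callable primitives. Anything absent from this map is rejected,
// so an expression can never reach arbitrary code.
const PRIMITIVES = new Map<string, Primitive>([
  ["rsi", (args, tick) => tick[`rsi${String(args[0])}`] ?? NaN],
  ["vix", (_args, tick) => tick["vix"] ?? NaN],
]);

// Pure recursive walk: no I/O, no globals, same input gives same output.
function evaluate(expr: Expr, tick: Tick): Value {
  switch (expr.kind) {
    case "num":
    case "bool":
      return expr.value;
    case "call": {
      const fn = PRIMITIVES.get(expr.name);
      if (!fn) throw new Error(`unknown primitive: ${expr.name}`);
      return fn(expr.args.map((a) => evaluate(a, tick)), tick);
    }
    case "cmp": {
      const l = evaluate(expr.left, tick);
      const r = evaluate(expr.right, tick);
      if (expr.op === "==") return l === r;
      if (typeof l !== "number" || typeof r !== "number") {
        throw new Error("type error: ordering comparisons need numbers");
      }
      return expr.op === "<" ? l < r : l > r;
    }
    case "all":
      return expr.items.every((e) => evaluate(e, tick) === true);
    case "any":
      return expr.items.some((e) => evaluate(e, tick) === true);
  }
}

// all(rsi(14) < 35, vix() < 20)
const bullish: Expr = {
  kind: "all",
  items: [
    { kind: "cmp", op: "<", left: { kind: "call", name: "rsi", args: [{ kind: "num", value: 14 }] }, right: { kind: "num", value: 35 } },
    { kind: "cmp", op: "<", left: { kind: "call", name: "vix", args: [] }, right: { kind: "num", value: 20 } },
  ],
};
const entrySignal = evaluate(bullish, { rsi14: 31.2, vix: 17.5 });
```

Because missing data resolves to `NaN`, every ordering comparison on it is false, so a signal with absent inputs fails closed in this sketch.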

### Example 1 — Sentiment + Technical Entry

```yaml
strategy: "NVDA sentiment squeeze"
mode: paper
universe: [NVDA]
signals:
  bullish:
    - rsi(14) < 35
    - sentiment(reddit, 24h) > 0.6
    - vix() < 20
    - earnings_blackout() == false
  bearish:
    - rsi(14) > 75
    - position_pnl_pct() > 8
    - position_age_hours() > 72
entry:
  when: all(bullish)
  size: kelly(edge=0.04, max_position_usd=5000)
exit:
  when: any(bearish)
risk:
  max_position_usd: 5000
  max_daily_loss_pct: 2
  stop_loss_pct: 3
  trailing_stop_pct: 1.5
```

### Example 2 — Pairs Trade with Regime Filter

```yaml
strategy: "BTC/ETH pairs"
mode: live
universe: [BTC, ETH]
signals:
  long_btc_short_eth:
    - zscore(spread(price(BTC), price(ETH)*ratio(BTC,ETH,30d))) < -2
    - vix() < 30
    - llm_confirms("market regime is mean-reverting, not trending", confidence_min=0.7)
risk:
  max_pair_notional: 10000
  correlation_floor: 0.6
```

### Primitive Categories (47+ primitives in `dsl/types.ts`)

| Category | Examples |
|---|---|
| Price/volume | `price(sym)`, `vwap(sym, window)`, `volume_z(sym, window)` |
| Technicals | `rsi`, `macd`, `bb_upper`, `bb_lower`, `atr`, `obv`, `adx`, `zscore` |
| Sentiment | `sentiment(reddit\|news, window)`, `whale_activity(sym)`, `funding_rate(sym)`, `human_automation_score(sym)` |
| Macro | `vix()`, `treasury_2y10y_spread()`, `fed_blackout()`, `cpi_release_today()`, `curve_slope_2s10s()`, `credit_ratio_hy_ig()`, `move_index()`, `real_yield_10y()`, `breakeven_5y5y()` |
| Macro Overlay | `paradox_severity()`, `paradox_active(type)`, `narrative_confidence()`, `geopolitical_intensity()`, `surprise_index()`, `regime_mismatch()`, `narrative_theme(keyword)` |
| Position state | `position_pnl_pct()`, `position_age_hours()`, `account_drawdown_pct()`, `drawdown_from_peak()`, `underwater_bars()` |
| Sizing | `kelly(win_prob, payoff_ratio, max_position_usd)`, `kelly_adaptive(...)`, **`kelly_regime_aware(win_prob, payoff_ratio, max_position_usd)`** ← NEW: auto-scales Kelly by regime (BULL=0.5, SIDEWAYS=0.4, RECOVERY=0.35, BEAR=0.3, CRISIS=0.25), `fixed_usd(amount)`, `risk_pct(pct)` |
| Crypto Edge | `pump_dump_detected(sym)`, `funding_rate_extreme(sym)`, `liquidation_cascade(sym)`, `fear_greed_index()` |
| Price-Action | `fvg_bullish()`, `fvg_bearish()`, `hammer()`, `doji()`, `engulfing_bullish()`, `double_bottom()`, `bull_flag()` |
| Pattern Detection | `pattern_hit_rate(type, direction)`, `pattern_is_significant(type)`, `pattern_proximity(type)`, `pattern_active(type, sym)` |
| DTW Templates | `dtw_template_match(threshold)`, `dtw_template_similarity()`, `dtw_template_direction()`, `dtw_template_name()` |
| Prediction Markets | `prediction_market_prob(slug)`, `fed_rate_cut_prob()`, `recession_prob()`, `election_uncertainty()` |
| Paradox Analysis | `paradox_severity()`, `narrative_confidence()`, `geopolitical_intensity()`, `surprise_index()` |
| LLM | `llm_confirms("…", confidence_min=N)` — sandboxed read-only context |
| Composers | `all`, `any`, `xor` |
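
To make the regime scaling in the Sizing row concrete, here is a minimal sketch of how `kelly_regime_aware` could be computed. It assumes the textbook Kelly fraction `f = p - (1 - p) / b` and takes `equityUsd` and `regime` as explicit arguments; in the DSL those presumably come from context, so the extra parameters are hypothetical.

```ts
type Regime = "BULL" | "SIDEWAYS" | "RECOVERY" | "BEAR" | "CRISIS";

// Multipliers taken from the Sizing row above.
const REGIME_KELLY_SCALE: Record<Regime, number> = {
  BULL: 0.5,
  SIDEWAYS: 0.4,
  RECOVERY: 0.35,
  BEAR: 0.3,
  CRISIS: 0.25,
};

// Returns a position size in USD. equityUsd and regime are assumptions of this sketch.
function kellyRegimeAware(
  winProb: number,
  payoffRatio: number,
  maxPositionUsd: number,
  equityUsd: number,
  regime: Regime,
): number {
  if (payoffRatio <= 0 || winProb <= 0 || winProb > 1) return 0;
  // Full Kelly fraction for a binary bet with payoff ratio b.
  const fullKelly = winProb - (1 - winProb) / payoffRatio;
  if (fullKelly <= 0) return 0; // no edge, no position
  const sizedUsd = fullKelly * REGIME_KELLY_SCALE[regime] * equityUsd;
  return Math.min(sizedUsd, maxPositionUsd);
}

const bullSize = kellyRegimeAware(0.55, 1.5, 5000, 20000, "BULL");
const crisisSize = kellyRegimeAware(0.55, 1.5, 5000, 20000, "CRISIS");
```

With the same edge, the crisis-regime size comes out at half the bull-regime size, since the multipliers are 0.25 and 0.5.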

---

## Self-Modification Modes

Three trust modes per strategy. **Live strategies cannot start in Mode 3.**

### Mode 1 — Manual Only (default for new strategies)
The LLM can analyse and *report*, but it cannot change anything.

### Mode 2 — LLM-Proposed, Human-Approved (default for live)
LLM watches telemetry, queues proposed modifications as PRs. Each PR shows:
- Current parameter → proposed value
- Trigger metric (e.g., "30-day Sharpe dropped from 1.4 → 0.6")
- LLM reasoning (1 paragraph, schema-validated)
- Backtest delta on last 90 days (auto-run before queueing — actual numbers, not LLM-claimed)
- Risk narration: worst-case drawdown impact (deterministic computation)
- Human clicks Approve / Reject / Approve-with-edits

### Mode 3 — LLM-Autonomous Within Sandbox
User explicitly bounds what the LLM can change without asking:

```yaml
self_modify:
  mode: autonomous
  bounds:
    - param: rsi_oversold_threshold
      range: [25, 40]
      max_change_per_24h: 2
    - param: position_size_kelly_fraction
      range: [0.1, 0.3]
      max_change_per_7d: 0.05
  forbidden_changes:
    - all guardrail values (max_daily_loss_pct, stop_loss_pct, etc.)
    - asset universe additions
    - mode escalation (paper→live always requires human)
  rollback_trigger:
    - if rolling_3d_pnl < -2% → revert all autonomous changes from last 7d
```
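
The bounds above could be enforced with a check along these lines. This is a sketch under one stated assumption: `max_change_per_24h` / `max_change_per_7d` is read as a cap on the cumulative absolute move within the trailing window. The `Bound` and `AppliedChange` shapes and the function name are hypothetical.

```ts
interface Bound {
  param: string;
  range: [number, number];
  maxChange: number;   // largest cumulative move allowed inside the window
  windowHours: number; // 24 for max_change_per_24h, 168 for max_change_per_7d
}
interface AppliedChange { param: string; delta: number; atMs: number }
type Verdict = { ok: true } | { ok: false; reason: string };

function checkAutonomousChange(
  bounds: Bound[],
  history: AppliedChange[],
  param: string,
  current: number,
  proposed: number,
  nowMs: number,
): Verdict {
  // Parameters without an explicit bound (guardrails, universe, mode) are never touchable.
  const bound = bounds.find((b) => b.param === param);
  if (!bound) return { ok: false, reason: "param is outside the sandbox" };

  const [lo, hi] = bound.range;
  if (proposed < lo || proposed > hi) return { ok: false, reason: "proposed value out of range" };

  // Sum what has already moved in the trailing window, then add this move.
  const windowStart = nowMs - bound.windowHours * 3600 * 1000;
  const alreadyMoved = history
    .filter((c) => c.param === param && c.atMs >= windowStart)
    .reduce((sum, c) => sum + Math.abs(c.delta), 0);
  if (alreadyMoved + Math.abs(proposed - current) > bound.maxChange) {
    return { ok: false, reason: "rate limit for window exceeded" };
  }
  return { ok: true };
}

const exampleBounds: Bound[] = [
  { param: "rsi_oversold_threshold", range: [25, 40], maxChange: 2, windowHours: 24 },
];
```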

---

## Stop-Trading Hierarchy (5 Redundant Layers)

| Layer | Examples | Latency | Override |
|---|---|---|---|
| **L1: Hard account guardrails** | Daily loss ≥ 2%, account DD ≥ 10%, leverage > 2x, **gross exposure ≥ 200%** (hard), **net exposure ≥ 100%** (soft), **intraday VaR ≥ 5%** (soft, 15-min checks) | <1s | None — must be reset by user via UI with confirmation |
| **L2: Per-strategy circuit-breakers** | Consecutive losses ≥ 5, Sharpe degradation > 50%, regime mismatch | <5s | Human can override after review |
| **L3: Time/event blackouts** | Earnings ±2h, FOMC ±1h, market close ±15min, news event detected | Pre-scheduled or event-driven | Manual override via UI |
| **L4: LLM-detected anomalies** | Slippage > 3x expected, fill latency > 5s, signal confidence drop, "this looks like a flash crash" | <30s | Auto-resumes when condition clears + LLM confirms (with anti-hallucination gate) |
| **L5: Manual** | Panic button (kills everything), per-strategy pause, per-symbol pause | Instant | — |

**New L1 guardrails (added 2026-05-04 from expert audit):**
- `max_gross_exposure_pct`: 200% (hard) — prevents >2x leverage via sum of absolute position values
- `max_net_exposure_pct`: 100% (soft) — prevents excessive directional exposure
- `max_intraday_var_pct`: 5% (soft) — historical simulation VaR checked every 15 minutes

**Critical invariant:** L1 and L5 cannot be modified or disabled by any LLM autonomous action. Ever. They live in `algorithm_guardrails` with `hard=true` and the SQL update path checks `OLD.hard = false` before allowing change.
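
For orientation, an L1 check over account state might look like the sketch below. The thresholds are the ones listed above; the `AccountState` and `Breach` shapes are assumptions, and the leverage and intraday VaR checks are left out to keep the example short.

```ts
interface Position { symbol: string; notionalUsd: number } // signed: shorts are negative
interface AccountState {
  equityUsd: number;
  dayStartEquityUsd: number;
  peakEquityUsd: number;
  positions: Position[];
}
interface Breach { kind: string; valuePct: number; limitPct: number; hard: boolean }

// Frozen so nothing at runtime (algo or LLM path) can mutate the limits.
const L1_LIMITS = Object.freeze({
  dailyLossPct: 2,
  drawdownPct: 10,
  grossExposurePct: 200,
  netExposurePct: 100,
});

function checkL1(s: AccountState): Breach[] {
  // Gross = sum of absolute notionals; net = absolute value of the signed sum.
  const gross = s.positions.reduce((sum, p) => sum + Math.abs(p.notionalUsd), 0);
  const net = Math.abs(s.positions.reduce((sum, p) => sum + p.notionalUsd, 0));
  // A non-positive base yields Infinity, so the check fails closed.
  const pct = (x: number, base: number): number => (base > 0 ? (100 * x) / base : Infinity);

  const candidates: Breach[] = [
    { kind: "daily_loss", valuePct: pct(s.dayStartEquityUsd - s.equityUsd, s.dayStartEquityUsd), limitPct: L1_LIMITS.dailyLossPct, hard: true },
    { kind: "account_drawdown", valuePct: pct(s.peakEquityUsd - s.equityUsd, s.peakEquityUsd), limitPct: L1_LIMITS.drawdownPct, hard: true },
    { kind: "gross_exposure", valuePct: pct(gross, s.equityUsd), limitPct: L1_LIMITS.grossExposurePct, hard: true },
    { kind: "net_exposure", valuePct: pct(net, s.equityUsd), limitPct: L1_LIMITS.netExposurePct, hard: false },
  ];
  return candidates.filter((c) => c.valuePct >= c.limitPct);
}

const breaches = checkL1({
  equityUsd: 100000,
  dayStartEquityUsd: 101000,
  peakEquityUsd: 105000,
  positions: [
    { symbol: "BTC", notionalUsd: 150000 },
    { symbol: "ETH", notionalUsd: -80000 },
  ],
});
```

In this example a long/short book of 150k and 80k on 100k equity has only 70% net exposure but 230% gross, so only the hard gross-exposure limit trips.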

### Pre-trade sizing checks (run inside the executor's entry path)

The hierarchy above kills *running* strategies. Three additional sizing-time
guards shrink (or refuse) orders before they leave the executor:

| Check | Purpose | Source |
|---|---|---|
| **Slippage-aware sizing** | Caps order size where expected slippage > `edgeBps / 3`. Self-calibrating per (symbol, asset_class) via online linear regression on `algorithm_trades.slippage_bps`. | `algorithms/guardrails/slippage-fitter.ts` |
| **Multi-strategy correlation haircut** | Shrinks position by `√(effN/N)` when the user has other active runs on highly-correlated symbols (avg pairwise r ≥ 0.3). Both paper and live runs count toward concentration. | `algorithms/guardrails/correlation-haircut.ts` |
| **Max participation rate** (added 2026-05-04) | Orders capped at % of ADV: stocks 3%, ETFs 5%, crypto 2%, forex 10%. Prevents detection and adverse selection. | `trading/services/trade-execution.ts` |

The slippage and correlation decisions are persisted into `algorithm_trades.signal_snapshot_json`
so audit can see exactly how a trade was sized, even after the run ends.
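
Two of the checks in the table reduce to short formulas. The sketch below is illustrative only: the document does not define `effN`, so it assumes the common "effective number of independent bets" form `effN = N / (1 + (N - 1) * r)` for average pairwise correlation `r`. The function names are hypothetical; the participation rates are the ones in the table.

```ts
type AssetClass = "stock" | "etf" | "crypto" | "forex";

// Max share of average daily volume (ADV) per order, from the table above.
const MAX_PARTICIPATION: Record<AssetClass, number> = {
  stock: 0.03,
  etf: 0.05,
  crypto: 0.02,
  forex: 0.10,
};

// Multiplier in (0, 1] applied to position size. Assumed effN formula, see lead-in.
function correlationHaircut(activeRuns: number, avgPairwiseR: number): number {
  if (activeRuns <= 1 || avgPairwiseR < 0.3) return 1; // below the 0.3 trigger: no haircut
  const r = Math.min(avgPairwiseR, 1);
  const effN = activeRuns / (1 + (activeRuns - 1) * r);
  return Math.sqrt(effN / activeRuns);
}

// Caps the order notional at the asset class's share of ADV.
function capByParticipation(orderUsd: number, advUsd: number, assetClass: AssetClass): number {
  return Math.min(orderUsd, advUsd * MAX_PARTICIPATION[assetClass]);
}

// Four correlated runs (r = 0.5), then a crypto order against a 1M USD ADV.
const haircut = correlationHaircut(4, 0.5);
const sizedOrderUsd = capByParticipation(50000 * haircut, 1_000_000, "crypto");
```

Under these assumptions the haircut shrinks the order first and the participation cap is applied last, so the cap always holds regardless of the haircut.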

### LLM end-of-day live critic (added 2026-04-27)

Every UTC day, the `eod-critic-sweep` scheduler job evaluates every active
live run that hasn't been seen today. For each run it:

1. Pulls the last 24h of `algorithm_telemetry` + `algorithm_trades`
2. Asks Sonnet (t=0) to either `sign_off` or `escalate`
3. On escalate, routes a modification proposal through the **same** jury
   pipeline used for ad-hoc modification requests (proposer/critic/judge,
   0.85 confidence gate, 90-day cross-validation backtest).

The critic can never directly modify a strategy — it can only ask the jury
to consider one. Decisions and linked jury verdicts are persisted to
`predict.algorithm_eod_critic` and surfaced in the admin Audit modal.

---

## LLM Roles at Each Stage

| Stage | LLM job | Model |
|---|---|---|
| Authoring | NL → DSL translation, validation, suggestions | Claude Opus (called only occasionally, so total cost stays low) |
| Pre-flight critique | Look-ahead/survivorship/overfitting detection, risk narration | Claude Opus |
| Live monitoring | Per-tick "does this trade still make sense given current regime?" | Claude Haiku (fast, cheap, called every 1–5min in live mode) |
| Failure analysis | When P&L drops, generate a 1-paragraph explanation of which signals failed | Claude Opus |
| Modification proposals | Generate diff + reasoning + backtest result | Claude Opus |
| Adaptive tuning | Bayesian search over allowed param ranges, LLM picks promising regions | Hybrid: scipy/optuna + Claude Opus for direction |

All LLM calls go through the existing `LLMRouter` so usage tracking + per-org budgets work.

---

## Anti-Hallucination Gateway (Module 11)

This is the critical safety layer. **Every LLM output that affects state must pass through this gateway.**

### 9 Defenses

#### 1. Schema-Validated Outputs Only
Every LLM response is parsed against a Zod schema. Malformed output → reject + re-prompt with the parse error. Three retries max, then escalate to human queue.

```ts
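import { z } from 'zod';

// Illustrative subset; the real list is whatever the strategy exposes as tunable.
const ALLOWED_PARAMS = ['rsi_oversold_threshold', 'position_size_kelly_fraction'] as const;

const ProposalSchema = z.object({
  param: z.enum(ALLOWED_PARAMS),        // only whitelisted parameters parse
  oldValue: z.number(),
  newValue: z.number(),
  reason: z.string().min(50).max(500),
  triggerMetric: z.string(),
  confidence: z.number().min(0).max(1),
});

// llmJson: the LLM response after JSON.parse; reject() logs the failure and re-prompts.
const result = ProposalSchema.safeParse(llmJson);
if (!result.success) return reject('schema');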
```
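
The reject, re-prompt, and escalate flow described above can be sketched as a small retry loop. This version is library-agnostic: `callLlm` and `validate` are hypothetical injected functions (in practice `validate` would wrap something like `ProposalSchema.safeParse`), and "three retries max" is read as one initial call plus up to three re-prompts.

```ts
type Parsed<T> = { success: true; data: T } | { success: false; error: string };

type GatewayResult<T> =
  | { status: "ok"; data: T; attempts: number }
  | { status: "escalate_human"; lastError: string };

async function callWithValidation<T>(
  callLlm: (prompt: string) => Promise<unknown>,
  validate: (raw: unknown) => Parsed<T>,
  prompt: string,
  maxRetries = 3,
): Promise<GatewayResult<T>> {
  let currentPrompt = prompt;
  let lastError = "";
  // One initial attempt plus maxRetries re-prompts.
  for (let attempt = 1; attempt <= 1 + maxRetries; attempt++) {
    const raw = await callLlm(currentPrompt);
    const result = validate(raw);
    if (result.success) return { status: "ok", data: result.data, attempts: attempt };
    lastError = result.error;
    // Feed the parse error back so the model can correct its output.
    currentPrompt = `${prompt}\n\nYour previous output failed validation: ${lastError}`;
  }
  // Retries exhausted: hand off to the human queue instead of guessing.
  return { status: "escalate_human", lastError };
}
```

Each re-prompt is built from the original prompt rather than the previous one, so error messages do not pile up across attempts.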

#### 2. Whitelist of Primitives
LLM-generated DSL is parsed by the deterministic parser. Unknown primitives or function names → parse error → re-prompt with the error message.

#### 3. Cite-or-Refuse for Data Claims
When the LLM makes a claim about market data ("VIX is 18.4, suggesting low vol regime"), it MUST cite the data point with timestamp + source. The gateway verifies the citation against the actual data snapshot.

```ts
// LLM output:
// "Per snapshot[2026-04-17T10:30:00Z].vix.value = 18.4, regime is..."
// Gateway checks: does snapshot[ts].vix.value == 18.4?
// If no → reject. LLM must re-call with correct data.
```
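
A minimal version of that gateway check could look like this. The citation syntax is taken from the comment above; the `Snapshots` shape, the regular expression, and the rule that an uncited claim is rejected outright are assumptions of the sketch.

```ts
// timestamp -> field -> value, frozen at prompt-build time (hypothetical shape).
type Snapshots = { [ts: string]: { [field: string]: number | undefined } | undefined };

interface CitationCheck { ok: boolean; failures: string[] }

function verifyCitations(llmText: string, snapshots: Snapshots): CitationCheck {
  // Matches e.g. snapshot[2026-04-17T10:30:00Z].vix.value = 18.4
  const pattern = /snapshot\[([^\]]+)\]\.(\w+)\.value\s*=\s*(-?\d+(?:\.\d+)?)/g;
  const failures: string[] = [];
  let found = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(llmText)) !== null) {
    found++;
    const ts = m[1] ?? "";
    const field = m[2] ?? "";
    const claimed = Number(m[3]);
    const snap = snapshots[ts];
    const actual = snap ? snap[field] : undefined;
    if (actual === undefined) {
      failures.push(`no data for ${field} at ${ts}`);
    } else if (Math.abs(actual - claimed) > 1e-9) {
      failures.push(`${field} at ${ts}: claimed ${claimed}, snapshot has ${actual}`);
    }
  }
  // Cite-or-refuse: a data claim with no citation at all is also a rejection.
  if (found === 0) failures.push("no citation found");
  return { ok: failures.length === 0, failures };
}

const frozen: Snapshots = { "2026-04-17T10:30:00Z": { vix: 18.4 } };
const goodClaim = verifyCitations(
  "Per snapshot[2026-04-17T10:30:00Z].vix.value = 18.4, regime is low-vol",
  frozen,
);
```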

#### 4. Cross-Validation Against Deterministic Engines
Any LLM-claimed backtest result is **re-run** by the deterministic backtester before being shown to the user. If LLM says "Sharpe 1.8" and the engine computes 0.6, the LLM number is dropped and a "hallucination flag" is logged.
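
In code, the rule amounts to "recompute, then compare". The sketch below uses an annualised Sharpe ratio with a zero risk-free rate and sample standard deviation as a stand-in for the real backtester, and the `tolerance` of 0.05 is an assumed value; the function names are hypothetical.

```ts
// Stand-in for the deterministic engine: annualised Sharpe, zero risk-free rate.
function sharpe(returns: number[], periodsPerYear = 252): number {
  const n = returns.length;
  if (n < 2) return 0;
  const mean = returns.reduce((a, b) => a + b, 0) / n;
  const variance = returns.reduce((a, r) => a + (r - mean) ** 2, 0) / (n - 1);
  const std = Math.sqrt(variance);
  return std === 0 ? 0 : (mean / std) * Math.sqrt(periodsPerYear);
}

interface CheckedMetric {
  value: number;              // always the engine's number, never the LLM's
  hallucinationFlag: boolean; // true when the LLM's claim disagreed
}

function crossValidateSharpe(llmClaimed: number, returns: number[], tolerance = 0.05): CheckedMetric {
  const engineValue = sharpe(returns);
  return {
    value: engineValue,
    hallucinationFlag: Math.abs(llmClaimed - engineValue) > tolerance,
  };
}

const dailyReturns = [0.01, -0.005, 0.007, 0.002, -0.003];
const checked = crossValidateSharpe(1.8, dailyReturns);
```

The returned `value` is taken from the engine in every case, so a wrong claim can be flagged and logged without ever reaching the user.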

#### 5. LLM-Jury (Proposer / Critic / Judge)
For any decision that affects state — modifications to Mode-2 live strategies, paper→live escalation, autonomous Mode-3 actions — we run a **three-role LLM jury** instead of trusting one model:

```
                    PROPOSER LLM
                  (Claude Opus, t=0.4)
                  ┌────────────┐
                  │ Generates  │
                  │ proposal + │
                  │ reasoning  │
                  └─────┬──────┘
                        ▼
              ┌─────────────────────┐
              │ CRITIC LLM           │
              │ (different config:   │
              │  Opus, t=0.5,        │  ← Higher temp to find edge cases
              │  adversarial prompt) │
              │                      │
              │ "What's wrong with   │
              │  this proposal?      │
              │  Look for hidden     │
              │  risks, look-ahead   │
              │  bias, regime traps, │
              │  fat tails..."       │
              └──────────┬───────────┘
                         │
              ┌──────────┴───────────┐
              │ BLIND HANDOFF        │
              │ Judge receives ONLY: │
              │ • Critical issues: N │
              │ • Warning issues: N  │
              │ • Info issues: N     │
              │ • Critic confidence  │
              │ (NOT the text!)      │
              └──────────┬───────────┘
                         ▼
                ┌────────────────────┐
                │ JUDGE LLM            │
                │ (Haiku, t=0.1,       │  ← Near-deterministic
                │  forced JSON output) │
                │                      │
                │ Given proposer's     │
                │ proposal + critic    │
                │ STATISTICS, return:  │
                │   ACCEPT / REJECT /  │
                │   ESCALATE_HUMAN     │
                │ + 1-line reason      │
                └──────────┬───────────┘
                           ▼
            ┌──────────────────────────┐
            │ Decision routing         │
            │ ACCEPT → applies (Mode 3)│
            │         / queues (Mode 2)│
            │ REJECT → log + drop      │
            │ ESCALATE → human review  │
            └──────────────────────────┘
```

**Why three roles, not two:** A simple two-call consensus has a known weakness — both calls can hallucinate the same way (especially same model, same prompt). Splitting into adversarial roles forces structurally different outputs:

- **Proposer** is creative (t=0.4, generative prompt)
- **Critic** is destructive (t=0.5, "find what's wrong" prompt; explicitly told to be adversarial, not helpful)
- **Judge** is mechanical (t=0.1, forced JSON schema, given the proposal plus the Critic's summary statistics only, no creative freedom)

**Diversity rules to defeat correlated hallucination:**
- Use different temperatures (0.4 / 0.5 / 0.1) — updated 2026-05-04 from expert audit
- Optionally rotate providers (Anthropic for proposer, OpenAI for critic if `OPENAI_API_KEY` is set)
- Use different models within Anthropic (Opus → Opus → Haiku)
- Different system prompts (cooperative vs adversarial vs judicial)
- **BLIND JURY PATTERN** (added 2026-05-04): Judge sees ONLY summary statistics from the Critic (critical/warning/info issue counts + confidence), NOT the Critic's narrative text. This prevents anchoring bias where the Judge is influenced by how the Critic framed concerns.

**Escalation triggers** that bypass autonomous action:
- Critic flags ≥1 "critical" issue
- Judge returns `ESCALATE_HUMAN`
- Any of the three calls returns confidence < 0.85
- Proposer and Critic disagree on a numerical fact (cite-or-refuse cross-check)
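
The blind handoff and the routing rules can be sketched as two small pure functions. All type and function names here are hypothetical. One ordering choice is an assumption: a Judge `REJECT` is dropped before the escalation triggers are evaluated, since those triggers exist to block autonomous *acceptance*.

```ts
type Severity = "critical" | "warning" | "info";
interface CriticOutput { issues: { severity: Severity; text: string }[]; confidence: number }
interface BlindSummary { critical: number; warning: number; info: number; criticConfidence: number }
type JudgeDecision = "ACCEPT" | "REJECT" | "ESCALATE_HUMAN";
type Outcome = "apply" | "queue_for_human_approval" | "drop" | "human_review";

// Blind handoff: keep the counts and confidence, discard the narrative text.
function toBlindSummary(critic: CriticOutput): BlindSummary {
  const count = (s: Severity): number => critic.issues.filter((i) => i.severity === s).length;
  return {
    critical: count("critical"),
    warning: count("warning"),
    info: count("info"),
    criticConfidence: critic.confidence,
  };
}

function routeDecision(input: {
  mode: 2 | 3;
  judge: JudgeDecision;
  summary: BlindSummary;
  confidences: [number, number, number]; // proposer, critic, judge
  numericDisagreement: boolean;          // from the cite-or-refuse cross-check
}): Outcome {
  if (input.judge === "REJECT") return "drop";
  const mustEscalate =
    input.judge === "ESCALATE_HUMAN" ||
    input.summary.critical >= 1 ||
    input.confidences.some((c) => c < 0.85) ||
    input.numericDisagreement;
  if (mustEscalate) return "human_review";
  // ACCEPT: Mode 3 applies directly, Mode 2 still queues for a human click.
  return input.mode === 3 ? "apply" : "queue_for_human_approval";
}

const exampleSummary = toBlindSummary({
  issues: [
    { severity: "warning", text: "possible look-ahead bias in the 90d window" },
    { severity: "info", text: "capacity looks fine" },
  ],
  confidence: 0.9,
});
```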

**Cost gate:** the LLM-jury fires only on state-changing decisions. Read-only analyses (failure narration, telemetry summaries) use a single Opus call. Per-tick monitoring uses single Haiku.

**Audit trail:** all three model outputs are logged to `predict.algorithm_llm_jury` with prompt hashes, model IDs, temperatures, and the final decision. We can replay any decision later.

#### 6. Confidence Self-Reporting + Threshold
LLM must self-report confidence in [0,1]. Below threshold (default 0.7 for proposals, 0.85 for live) → automatic human-review queue, no autonomous action.

#### 7. Hallucination Logging + Pattern Detection
Every rejected LLM output is logged to `predict.algorithm_llm_rejections` with: model, prompt hash, output, rejection reason, timestamp. Aggregated weekly to spot patterns ("model X hallucinates VIX values 3x more than model Y").

#### 8. Sandboxed Read-Only Data Snapshots
LLMs never see "current" data — they see a frozen snapshot at a known timestamp, passed in the prompt context. The LLM cannot make up "real-time" prices because there's no live channel exposed to it.

#### 9. Adversarial Review Prompt
For every proposed modification, a **second LLM call** runs with the explicit prompt: *"Find what's wrong with this proposal. Look for: hidden risks, look-ahead bias, regime fragility, capacity issues, correlation traps, fat-tail underestimation."* Critical issues found → proposal blocked.

### Output Caps (mechanical safety net even if all 9 fail)
- Mode 3 autonomous: max 1 modification per parameter per `max_change_per_24h` window
- Position size proposals: clamped to ≤2x current
- Stop-loss proposals: cannot widen by more than 25%
- Universe additions: forbidden in Mode 3 (always human)
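
The two numeric caps are simple clamps. In this sketch the 25% stop-loss limit is interpreted as relative to the current stop (3% may widen to at most 3.75%); that interpretation and the function names are assumptions.

```ts
// Position size proposals: never more than 2x the current size.
function clampPositionSize(currentUsd: number, proposedUsd: number): number {
  return Math.min(proposedUsd, 2 * currentUsd);
}

// Stop-loss proposals: may tighten freely, may widen by at most 25% (relative).
function clampStopLoss(currentPct: number, proposedPct: number): number {
  return Math.min(proposedPct, currentPct * 1.25);
}

const cappedSize = clampPositionSize(5000, 12000);
const cappedStop = clampStopLoss(3, 5);
const tighterStop = clampStopLoss(3, 2);
```

Because the clamps run after the gateway, they bound the damage of a proposal that slipped through every other defense.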

---

## Database Schema (Migration `032_algorithm_builder.sql`)

```sql
predict.user_algorithms          -- id, user_id, name, current_version_id, mode, created_at
predict.algorithm_versions       -- id, algorithm_id, ast_json, dsl_yaml, created_at,
                                 -- created_by ('user'|'llm-proposed'|'llm-autonomous')
predict.algorithm_modifications  -- id, version_id, parent_version_id, diff_json,
                                 -- llm_reasoning, trigger_metric, trigger_value,
                                 -- status ('queued'|'approved'|'rejected'|'reverted'),
                                 -- reviewed_by, llm_confidence, second_llm_agreed
predict.algorithm_runs           -- id, algorithm_id, version_id,
                                 -- mode ('backtest'|'paper'|'live'),
                                 -- started_at, ended_at, kill_reason
predict.algorithm_trades         -- id, run_id, signal_snapshot_json,
                                 -- decision_reason, executed_price, slippage, pnl
predict.algorithm_guardrails     -- id, user_id, scope ('account'|'strategy'),
                                 -- kind, threshold, hard (bool — uneditable when true)
predict.algorithm_kills          -- id, run_id,
                                 -- triggered_by ('manual'|'guardrail'|'llm'),
                                 -- reason, ts
predict.algorithm_llm_rejections -- id, model, prompt_hash, output_text,
                                 -- rejection_reason, rejection_layer, ts
predict.algorithm_llm_jury       -- id, decision_id (links to algorithm_modifications
                                 -- or algorithm_kills), proposer_model, proposer_output,
                                 -- critic_model, critic_output, judge_model, judge_decision,
                                 -- judge_reason, ts
predict.algorithm_telemetry      -- id, run_id, ts, equity, drawdown_from_peak,
                                 -- open_positions_count, signal_alpha_attribution_json
```

---

## Backtest UI

The existing `/api/predict/v1/backtest/` endpoint is minimal. The builder needs:

| Feature | Status | Effort |
|---|---|---|
| Historical replay over chosen window | Partial | M |
| Walk-forward validation (train/test sliding window) | Missing | M |
| Regime stress test (replay through 2018, 2020, 2022 separately) | Missing | M |
| Slippage + cost model integration | Have `calculateSlippage` now | S |
| Performance dash: Sharpe, Sortino, Calmar, max DD, win rate, avg trade, Kelly | Partial | M |
| Per-signal contribution (which signal added/subtracted alpha) | Missing | L |
| Side-by-side strategy comparison | Missing | M |
| LLM-narrated post-mortem ("why did this strategy underperform Q3 2022?") | Missing | S |
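
For the walk-forward row, the sliding train/test windows can be generated as below. This is a generic sketch, not the engine's implementation: it works on bar indices with exclusive end bounds and slides forward by one test window per fold. The `Fold` shape and function name are hypothetical.

```ts
interface Fold {
  trainStart: number; // inclusive bar index
  trainEnd: number;   // exclusive
  testStart: number;  // inclusive, always equal to trainEnd
  testEnd: number;    // exclusive
}

function walkForwardFolds(nBars: number, trainBars: number, testBars: number): Fold[] {
  const inputs = [nBars, trainBars, testBars];
  if (!inputs.every((x) => Number.isInteger(x) && x > 0)) {
    throw new Error("nBars, trainBars and testBars must be positive integers");
  }
  const result: Fold[] = [];
  // Each fold tests on bars strictly after its training window (no look-ahead).
  for (let start = 0; start + trainBars + testBars <= nBars; start += testBars) {
    const trainEnd = start + trainBars;
    result.push({ trainStart: start, trainEnd, testStart: trainEnd, testEnd: trainEnd + testBars });
  }
  return result;
}

// 1000 bars, train on 500, test on the next 100, then slide by 100.
const exampleFolds = walkForwardFolds(1000, 500, 100);
```

Sliding by exactly `testBars` means the test windows tile the tail of the history without overlap, so out-of-sample results from different folds can be concatenated into one equity curve.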

---

## Build Plan — 4 Sprints

### Sprint 1: Foundation (1 week)
- Migration `032_algorithm_builder.sql` (the 10 tables listed above)
- `signal-dsl/` package: parser, AST type, sandboxed evaluator, ~30 primitives
- `algorithm-service/`: CRUD on user_algorithms, version pinning
- `/admin/algorithms` skeleton page (list, create, view AST)
- **Anti-hallucination defenses 1, 2, 7 wired** (schema validation, whitelist, rejection logging)
- **Ship signal**: User can save a strategy + view its AST. No execution yet.

### Sprint 2: Backtest + Authoring (1.5 weeks)
- Backtest engine extension: walk-forward, regime stress, cost model
- LLM pre-flight critique endpoint (`POST /algorithms/:id/critique`)
- NL→DSL translator (`POST /algorithms/translate`)
- DSL editor UI with live syntax highlighting + critique inline
- Backtest results page with the performance-metrics table + equity curve + drawdown chart
- **Anti-hallucination defenses 3, 4, 9 wired** (cite-or-refuse, cross-validation, adversarial review)
- **Ship signal**: User can write strategy in YAML or English, get LLM critique, run backtest, see results.

### Sprint 3: Paper Trading + Guardrails (1.5 weeks)
- Live executor (paper mode) — connects to existing price service, evaluates DSL on tick, records trades to `algorithm_trades`
- Guardrail layer L1 + L5 (account hard limits + panic button); L3 blackouts via existing event calendar
- Per-strategy dashboard with live equity curve, current positions, recent trades, kill button
- **Anti-hallucination defense 8 wired** (snapshotted read-only context for LLM)
- **Ship signal**: User runs strategy in paper mode for ≥30 days. Guardrails fire correctly in adverse-event tests.

### Sprint 4: Self-Modification + Live-Ready (2 weeks)
- L2/L4 circuit breakers + LLM monitor loop
- Modification controller with all 3 modes
- Diff queue UI (proposed changes list, approve/reject buttons)
- Backtest-on-modification (auto-runs 90d backtest before queueing any LLM proposal)
- Live trading adapter interface (concrete broker integration is doc 13's scope)
- Audit log viewer
- **Anti-hallucination defenses 5, 6 wired** (LLM jury, confidence threshold)
- **Ship signal**: A Mode-2 strategy receives an LLM modification proposal, user approves it, change is applied, telemetry shows behaviour change.

**Total**: ~6 weeks. Live broker integration is doc 13.

---

## Risks I'd Watch

1. **LLM hallucination in DSL translation.** Mitigation: defense 1 (schema validation) + defense 2 (whitelist).
2. **LLM hallucinated backtest results.** Mitigation: defense 4 (cross-validation against deterministic engine).
3. **Look-ahead bias in backtest.** Mitigation: data engine timestamps every tick with `as_of_time` and rejects any signal expression that references future data.
4. **Survivorship bias in symbol universe.** Mitigation: backtest universe is point-in-time S&P 500 / coin list as of the test date.
5. **Self-modification spiral.** Mitigation: hard cap on autonomous changes per 24h, mandatory rollback if 3d P&L drops below threshold, every change is one-click revertable.
6. **LLM-monitor loop runaway cost.** Mitigation: per-tick LLM call uses Haiku and is rate-limited to once per 5min/strategy max in live mode.
7. **LLM proposal spam.** Mitigation: defenses 5 + 6 (LLM jury + confidence threshold).
8. **Fat-tail underestimation.** Mitigation: regime stress test must include 2008, 2020-Mar, 2022-Jun before paper→live promotion.

---

## Success Metrics (Definition of Done)

After Sprint 4 ships, we should be able to:

- [ ] User describes a strategy in plain English → LLM produces valid DSL → critique surfaces look-ahead bias warning → user fixes → backtest passes
- [ ] User saves strategy in Mode 2 → LLM proposes a modification after 30d telemetry → diff queue shows the LLM-jury verdict + confidence > 0.85 → user approves → change applied with audit log
- [ ] User pulls panic button → all strategies halt within 1 second → no LLM action can re-enable
- [ ] Adversarial test: ask LLM to make up a non-existent primitive → defense 2 rejects → re-prompt → corrected output succeeds
- [ ] Hallucination dashboard shows weekly rejection rates per defense layer

---

## Out of Scope (Explicitly)

- Real broker integration → see [13-broker-integration.md](./13-broker-integration.md)
- Multi-leg options strategies → v2
- Cross-asset margin optimisation → v2
- Algorithmic market-making → v2
- User-shareable strategy marketplace → v2
