# AI & LLM System

> Documentation for Agencio Predict's AI/LLM infrastructure, including Core API Keys, BYOK (Bring Your Own Key), usage tracking, and metering.

**See also:** [41-ai-api-keys-feature-matrix.md](./41-ai-api-keys-feature-matrix.md) — Quick-reference feature matrix comparing Platform Keys vs BYOK, subscription tier gating, and cost rates.

## Overview

Agencio Predict's AI infrastructure combines a two-tier key system with usage tracking and metering:

1. **Core API Key** — Platform-level keys that power the application's AI capabilities
2. **Organization BYOK** — Optional user-provided keys for organizations wanting to use their own API quotas
3. **Usage Tracking** — Per-user and per-organization tracking of all AI API calls
4. **Metering & Billing** — Integration with subscription limits and overage handling

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         AI/LLM KEY HIERARCHY                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                         CORE API KEYS                                   │ │
│  │                    (Platform-Level / System)                           │ │
│  │  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │ │
│  │  • Powers: All-Seeing Eye, AI Engine, Algorithm Builder               │ │
│  │  • Env Vars: LLMROUTER_*, CLAUDE_API_KEY, OPENAI_API_KEY              │ │
│  │  • Cost: Absorbed by platform (included in subscription)              │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                                   │                                          │
│                                   ▼                                          │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                     ORGANIZATION BYOK (Optional)                        │ │
│  │                    (Customer-Provided Keys)                            │ │
│  │  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  │ │
│  │  • When: Organization wants own quota/billing relationship            │ │
│  │  • Storage: AES-256-GCM encrypted in predict.organization_ai_settings │ │
│  │  • Cost: Billed directly to customer's API account                    │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Core API Keys (Platform-Level)

The platform uses **Core API Keys** as the "main brain" of the application. These keys are configured via environment variables and power all AI features.

### Configuration

```bash
# Primary: LLM Router (recommended - provides load balancing, caching, fallbacks)
LLMROUTER_BASE_URL=https://llmrouter.agencio.ai
LLMROUTER_API_KEY=sk-router-...

# Fallback: Direct Anthropic
CLAUDE_API_KEY=sk-ant-...
# or
ANTHROPIC_API_KEY=sk-ant-...

# Optional: Direct OpenAI (for specific features)
OPENAI_API_KEY=sk-...
```

### Provider Resolution Order

The LLM client (`packages/be/src/algorithms/llm/client.ts`) resolves providers in this order:

1. **LLM Router** (if `LLMROUTER_BASE_URL` + `LLMROUTER_API_KEY` are set)
2. **Anthropic Direct** (if `CLAUDE_API_KEY` or `ANTHROPIC_API_KEY` is set)
3. **Error** (if nothing is configured)
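
The resolution order above can be sketched as a small pure function. This is only an illustration: `LlmEnv`, `ResolvedProvider`, and `resolveProvider` are hypothetical names, and the real logic in `client.ts` may be structured differently (for example, which of the two Anthropic variables it checks first).

```typescript
// Hypothetical shapes for illustration; not taken from client.ts.
interface LlmEnv {
  LLMROUTER_BASE_URL?: string;
  LLMROUTER_API_KEY?: string;
  CLAUDE_API_KEY?: string;
  ANTHROPIC_API_KEY?: string;
}

type ResolvedProvider =
  | { kind: 'llm-router'; baseUrl: string; apiKey: string }
  | { kind: 'anthropic'; apiKey: string };

function resolveProvider(env: LlmEnv): ResolvedProvider {
  // 1. LLM Router wins when both its URL and its key are present.
  if (env.LLMROUTER_BASE_URL && env.LLMROUTER_API_KEY) {
    return {
      kind: 'llm-router',
      baseUrl: env.LLMROUTER_BASE_URL,
      apiKey: env.LLMROUTER_API_KEY,
    };
  }

  // 2. Otherwise fall back to a direct Anthropic key.
  //    Preferring CLAUDE_API_KEY over ANTHROPIC_API_KEY is an assumption.
  const anthropicKey = env.CLAUDE_API_KEY || env.ANTHROPIC_API_KEY;
  if (anthropicKey) {
    return { kind: 'anthropic', apiKey: anthropicKey };
  }

  // 3. Nothing usable is configured.
  throw new Error('No LLM provider configured');
}
```

Note that a router URL without a router key does not count as configured, so the function falls through to the Anthropic check.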

### Model Tiers

| Tier | Model | Use Case |
|------|-------|----------|
| `opus` | claude-sonnet-4-20250514 | Complex reasoning, algorithm critique, jury decisions |
| `haiku` | claude-haiku-4-5-20251001 | Fast tasks, translation, simple validation |

### Core AI Services Powered

| Service | Location | Purpose |
|---------|----------|---------|
| All-Seeing Eye | `packages/be/src/all-seeing-eye/` | Unified AI orchestration, black swan detection |
| AI Engine | `packages/be/src/ai-engine/` | Self-learning prediction engine |
| Algorithm Builder | `packages/be/src/algorithms/` | DSL evaluation, LLM critique, jury system |
| AI Watch | `packages/be/src/watch/` | Real-time monitoring and alerts |
| Marketing Predictor | `packages/be/src/marketing/` | Campaign performance forecasting |

## Organization BYOK (Bring Your Own Key)

Organizations can optionally provide their own API keys for:
- **Direct billing relationship** with OpenAI/Anthropic
- **Dedicated quota** not shared with other platform users
- **Enterprise compliance** requirements

### Storage & Security

| Aspect | Implementation |
|--------|----------------|
| Encryption | AES-256-GCM |
| Key derivation | `crypto.scryptSync()` with a static salt string (`'salt'`), producing a 32-byte key |
| Location | `predict.organization_ai_settings` table |
| IV | 16 bytes random per encryption |
| Format | `iv:authTag:encryptedData` |

```typescript
// Key encryption (from ai-settings-service.ts)
import * as crypto from 'node:crypto';

const ALGORITHM = 'aes-256-gcm';
const ENCRYPTION_KEY = process.env.AI_SETTINGS_ENCRYPTION_KEY;

function encrypt(text: string): string {
  const iv = crypto.randomBytes(16);
  const key = crypto.scryptSync(ENCRYPTION_KEY, 'salt', 32);
  const cipher = crypto.createCipheriv(ALGORITHM, key, iv);
  // ... returns iv:authTag:encrypted
}
```
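
For reference, a complete encrypt/decrypt round trip in the `iv:authTag:encryptedData` layout might look like the sketch below. It is not the code from `ai-settings-service.ts`: the function names `encryptKey` and `decryptKey` are hypothetical, the secret is passed in as a parameter rather than read from the environment, and hex encoding of each part is an assumption (the source only names the layout, not the encoding).

```typescript
import * as crypto from 'node:crypto';

const ALGORITHM = 'aes-256-gcm';

// Derive the 32-byte AES key as in the excerpt (scrypt with a static salt string).
function deriveKey(secret: string): Buffer {
  return crypto.scryptSync(secret, 'salt', 32);
}

// Assumption: each of the three parts is hex-encoded and joined with ':'.
function encryptKey(plaintext: string, secret: string): string {
  const iv = crypto.randomBytes(16); // fresh random IV for every encryption
  const cipher = crypto.createCipheriv(ALGORITHM, deriveKey(secret), iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const authTag = cipher.getAuthTag(); // only available after final()
  return [iv, authTag, encrypted].map((part) => part.toString('hex')).join(':');
}

function decryptKey(payload: string, secret: string): string {
  const parts = payload.split(':');
  if (parts.length !== 3) {
    throw new Error('Malformed payload: expected iv:authTag:encryptedData');
  }
  const iv = Buffer.from(parts[0], 'hex');
  const authTag = Buffer.from(parts[1], 'hex');
  const encrypted = Buffer.from(parts[2], 'hex');

  const decipher = crypto.createDecipheriv(ALGORITHM, deriveKey(secret), iv);
  decipher.setAuthTag(authTag);
  // final() throws if the ciphertext or the tag has been tampered with.
  return Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');
}
```

Because GCM is authenticated, a modified payload or a wrong secret makes `decryptKey` throw instead of returning garbage, which is the property that makes the stored `authTag` worth keeping.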

### Access Control

| Role | Can Use Keys | Can View Keys | Can Edit Keys |
|------|-------------|---------------|---------------|
| Owner | ✅ | ✅ (masked) | ✅ |
| Admin | ✅ | ✅ (masked) | ✅ |
| Member | ✅ | ❌ | ❌ |
| Viewer | ✅ | ❌ | ❌ |
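
The access matrix above can be expressed as a lookup table, which keeps the rules in one place. This is a sketch only: the lowercase role identifiers and the names `KEY_PERMISSIONS`, `canEditKeys`, and `canViewMaskedKeys` are assumptions made for illustration.

```typescript
// Role identifiers are assumed to be lowercase strings.
type OrgRole = 'owner' | 'admin' | 'member' | 'viewer';

interface KeyPermissions {
  canUse: boolean;        // may trigger AI calls that use the org's keys
  canViewMasked: boolean; // may see the masked key in settings
  canEdit: boolean;       // may add, replace, or delete keys
}

// Mirrors the Access Control table row by row.
const KEY_PERMISSIONS: Record<OrgRole, KeyPermissions> = {
  owner: { canUse: true, canViewMasked: true, canEdit: true },
  admin: { canUse: true, canViewMasked: true, canEdit: true },
  member: { canUse: true, canViewMasked: false, canEdit: false },
  viewer: { canUse: true, canViewMasked: false, canEdit: false },
};

function canEditKeys(role: OrgRole): boolean {
  return KEY_PERMISSIONS[role].canEdit;
}

function canViewMaskedKeys(role: OrgRole): boolean {
  return KEY_PERMISSIONS[role].canViewMasked;
}
```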

### API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| GET | `/api/predict/v1/user/ai-settings` | Get current AI settings |
| PUT | `/api/predict/v1/user/ai-settings` | Update AI settings (keys, model) |
| DELETE | `/api/predict/v1/user/ai-settings/keys/:type` | Delete specific API key |
| POST | `/api/predict/v1/user/ai-settings/validate` | Validate an API key |

### Settings UI

Users configure BYOK at `/settings` → **AI Settings**:

1. **Provider Selection**
   - `agencio` — Use platform keys (default)
   - `custom` — Use organization's own keys

2. **API Key Entry**
   - OpenAI API Key (optional)
   - Anthropic API Key (optional)

3. **Model Preference**
   - `auto` — Let system choose
   - Specific model selection

## Usage Tracking

All AI API calls are tracked at both user and organization levels.

### What Gets Tracked

| Field | Description |
|-------|-------------|
| `user_id` | User who initiated the request |
| `organization_id` | Organization context |
| `model` | Model used (e.g., `gpt-4o`, `claude-3-opus`) |
| `provider` | Provider (`agencio`, `openai`, `anthropic`) |
| `prompt_tokens` | Input tokens consumed |
| `completion_tokens` | Output tokens generated |
| `total_tokens` | Sum of input + output |
| `estimated_cost_cents` | Calculated cost based on model rates |
| `latency_ms` | Response time |
| `endpoint` | API endpoint that triggered the call |

### Database Schema

```sql
-- AI Usage Log (per-call tracking)
CREATE TABLE predict.ai_usage_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL,
  model VARCHAR(100) NOT NULL,
  provider VARCHAR(50) NOT NULL,
  endpoint VARCHAR(255),
  prompt_tokens INTEGER DEFAULT 0,
  completion_tokens INTEGER DEFAULT 0,
  estimated_cost_cents INTEGER DEFAULT 0,
  latency_ms INTEGER,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Organization AI Settings (BYOK + aggregates)
CREATE TABLE predict.organization_ai_settings (
  organization_id UUID PRIMARY KEY,
  provider VARCHAR(50) DEFAULT 'agencio',
  openai_key_encrypted TEXT,
  anthropic_key_encrypted TEXT,
  preferred_model VARCHAR(50) DEFAULT 'auto',
  total_tokens_used BIGINT DEFAULT 0,
  monthly_token_limit BIGINT,
  last_api_call_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
```

### Logging Usage

```typescript
import { logAIUsage } from '@agencio-predict/be/user/ai-settings-service';

// After each AI API call
await logAIUsage(
  userId,
  organizationId,
  'claude-3-opus',      // model
  'anthropic',          // provider
  500,                  // prompt tokens
  1200,                 // completion tokens
  1850,                 // latency ms
  '/api/predict/v1/ai-engine/predict'  // endpoint
);
```

### Retrieving Usage Stats

```typescript
import { getOrgAIUsageStats } from '@agencio-predict/be/user/ai-settings-service';

const stats = await getOrgAIUsageStats(organizationId, 30); // Last 30 days

// Returns:
{
  totalTokens: 1500000,
  totalCalls: 2500,
  estimatedCostCents: 4500,  // $45.00
  byModel: {
    'claude-3-opus': { tokens: 1000000, calls: 1500, cost: 3750 },
    'claude-3-haiku': { tokens: 500000, calls: 1000, cost: 750 }
  },
  byDay: [
    { date: '2026-04-20', tokens: 50000, calls: 100 },
    { date: '2026-04-19', tokens: 45000, calls: 95 },
    // ...
  ]
}
```
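
To show how a result of this shape could be assembled from per-call log rows, here is a minimal in-memory aggregation. It is a sketch, not the implementation of `getOrgAIUsageStats` (which presumably aggregates in SQL): the `UsageRow` type and `aggregateUsage` function are hypothetical, and grouping days by UTC date is an assumption.

```typescript
// Hypothetical in-memory view of one predict.ai_usage_log row.
interface UsageRow {
  model: string;
  promptTokens: number;
  completionTokens: number;
  estimatedCostCents: number;
  createdAt: Date;
}

interface ModelStats { tokens: number; calls: number; cost: number; }
interface DayStats { date: string; tokens: number; calls: number; }

interface UsageStats {
  totalTokens: number;
  totalCalls: number;
  estimatedCostCents: number;
  byModel: Record<string, ModelStats>;
  byDay: DayStats[];
}

function aggregateUsage(rows: UsageRow[]): UsageStats {
  const byModel: Record<string, ModelStats> = {};
  const byDayMap = new Map<string, DayStats>();
  let totalTokens = 0;
  let estimatedCostCents = 0;

  for (const row of rows) {
    const tokens = row.promptTokens + row.completionTokens;
    totalTokens += tokens;
    estimatedCostCents += row.estimatedCostCents;

    // Per-model totals.
    const modelStats = byModel[row.model] || { tokens: 0, calls: 0, cost: 0 };
    modelStats.tokens += tokens;
    modelStats.calls += 1;
    modelStats.cost += row.estimatedCostCents;
    byModel[row.model] = modelStats;

    // Per-day totals, keyed by the UTC calendar date (YYYY-MM-DD).
    const date = row.createdAt.toISOString().slice(0, 10);
    const dayStats = byDayMap.get(date) || { date, tokens: 0, calls: 0 };
    dayStats.tokens += tokens;
    dayStats.calls += 1;
    byDayMap.set(date, dayStats);
  }

  // Newest day first, matching the ordering in the sample output.
  const byDay = Array.from(byDayMap.values()).sort((a, b) => b.date.localeCompare(a.date));

  return { totalTokens, totalCalls: rows.length, estimatedCostCents, byModel, byDay };
}
```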

## Metering & Billing Integration

AI token usage is integrated with the subscription billing system.

### Usage Limits by Plan

| Plan | AI Tokens/Month | Predictions | API Calls |
|------|-----------------|-------------|-----------|
| Free | 10,000 | 100 | 1,000 |
| Pro | 500,000 | 5,000 | 50,000 |
| Enterprise | 5,000,000 | Unlimited | Unlimited |
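
As an illustration of how the table translates into a limit check, the sketch below encodes the limits as a constant and computes the remaining allowance. The names `PLAN_LIMITS` and `checkLimit`, the lowercase plan identifiers, the `predictions` and `api_calls` metric keys, and the use of `null` for "Unlimited" are all assumptions; only `ai_tokens` appears as a metric name elsewhere in this document.

```typescript
type Plan = 'free' | 'pro' | 'enterprise';
type Metric = 'ai_tokens' | 'predictions' | 'api_calls';

// Monthly limits from the table; null stands for "Unlimited" (an assumed convention).
const PLAN_LIMITS: Record<Plan, Record<Metric, number | null>> = {
  free: { ai_tokens: 10_000, predictions: 100, api_calls: 1_000 },
  pro: { ai_tokens: 500_000, predictions: 5_000, api_calls: 50_000 },
  enterprise: { ai_tokens: 5_000_000, predictions: null, api_calls: null },
};

interface LimitCheck {
  allowed: boolean;
  used: number;
  limit: number | null;     // null = unlimited
  remaining: number | null; // null = unlimited
}

function checkLimit(plan: Plan, metric: Metric, used: number, requested: number): LimitCheck {
  const limit = PLAN_LIMITS[plan][metric];
  if (limit === null) {
    return { allowed: true, used, limit: null, remaining: null };
  }
  const remaining = Math.max(0, limit - used);
  // The request is allowed only if it fits entirely within what is left.
  return { allowed: requested <= remaining, used, limit, remaining };
}
```

For example, a Pro organization that has used 350,000 tokens has 150,000 remaining, so a 100,000-token request passes while a 200,000-token request does not.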

### Recording AI Usage for Billing

```typescript
import { recordAiTokens } from '@agencio-predict/be/billing';

// Record AI token usage for billing purposes
await recordAiTokens(
  organizationId,
  totalTokens,
  userId,
  { model: 'claude-3-opus', endpoint: '/ai-engine/predict' }
);
```

### Checking Usage Limits

```typescript
import { NextResponse } from 'next/server';
import { checkUsageLimit } from '@agencio-predict/be/billing';

// Before making an AI call, check whether the organization has tokens remaining
const check = await checkUsageLimit(organizationId, 'ai_tokens', estimatedTokens);

if (!check.allowed) {
  return NextResponse.json({
    error: 'AI token limit exceeded',
    upgradeRequired: true,
    used: check.used,
    limit: check.limit,
    remaining: check.remaining,
  }, { status: 402 });
}
```

### Usage Dashboard

Users can view their AI usage at `/settings/billing`:

```typescript
// Get current usage summary
const usage = await getCurrentUsage(organizationId);

// Returns:
{
  organizationId: '...',
  period: { start: '2026-04-01', end: '2026-05-01' },
  aiTokens: {
    used: 350000,
    limit: 500000,
    percentage: 70,
    isUnlimited: false,
    isExceeded: false
  },
  // ... other metrics
}
```

## Cost Calculation

Token costs are calculated per model:

| Model | Cost per 1K Tokens |
|-------|-------------------|
| gpt-4o | $1.50 |
| gpt-4o-mini | $0.075 |
| gpt-4-turbo | $3.00 |
| claude-3-opus | $7.50 |
| claude-3-sonnet | $1.50 |
| claude-3-haiku | $0.125 |

```typescript
// From ai-settings-service.ts
const costPerThousandTokens: Record<string, number> = {
  'gpt-4o': 1.5,
  'gpt-4o-mini': 0.075,
  'gpt-4-turbo': 3.0,
  'claude-3-opus': 7.5,
  'claude-3-sonnet': 1.5,
  'claude-3-haiku': 0.125,
};

const totalTokens = promptTokens + completionTokens;
const rate = costPerThousandTokens[model] || 1.0;
const estimatedCostCents = Math.round((totalTokens / 1000) * rate * 100);
```
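
As a worked example, the formula above can be wrapped in a function and applied to the call from the logging example (500 prompt tokens plus 1,200 completion tokens on `claude-3-opus`). The function name `estimateCostCents` is hypothetical; the rate table and arithmetic are copied from the excerpt as documented.

```typescript
// Rates copied from the excerpt above.
const costPerThousandTokens: Record<string, number> = {
  'gpt-4o': 1.5,
  'gpt-4o-mini': 0.075,
  'gpt-4-turbo': 3.0,
  'claude-3-opus': 7.5,
  'claude-3-sonnet': 1.5,
  'claude-3-haiku': 0.125,
};

// Hypothetical wrapper around the documented formula.
function estimateCostCents(model: string, promptTokens: number, completionTokens: number): number {
  const totalTokens = promptTokens + completionTokens;
  // Unknown models fall back to a rate of 1.0, as in the excerpt.
  const rate = costPerThousandTokens[model] || 1.0;
  return Math.round((totalTokens / 1000) * rate * 100);
}

// 1,700 tokens at 7.5 per 1K: 1.7 * 7.5 * 100 = 1275
const opusCall = estimateCostCents('claude-3-opus', 500, 1200);

// A model missing from the table uses the fallback rate: 1.0 * 1.0 * 100 = 100
const unknownCall = estimateCostCents('some-unlisted-model', 600, 400);
```

Note that prompt and completion tokens are charged at the same rate here, so the result is an estimate rather than a reproduction of a provider invoice.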

## User-Level Preferences

Individual users can override organization defaults for model selection:

```sql
CREATE TABLE predict.user_ai_preferences (
  user_id UUID PRIMARY KEY,
  preferred_model VARCHAR(50),  -- null = use org default
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
```

### Effective Settings Resolution

```typescript
const effectiveSettings = await getEffectiveAISettings(userId);

// Resolution order:
// 1. User's preferred_model (if set)
// 2. Organization's preferred_model
// 3. 'auto' (system decides based on task)
```
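
The resolution order amounts to a chain of fallbacks. A minimal sketch, assuming both preferences are available as nullable strings (the function name `resolvePreferredModel` is hypothetical and is not the signature of `getEffectiveAISettings`):

```typescript
// null means "not set" at that level, matching the nullable preferred_model column.
function resolvePreferredModel(
  userPreferred: string | null,
  orgPreferred: string | null,
): string {
  // 1. user preference, 2. organization preference, 3. 'auto'
  return userPreferred ?? orgPreferred ?? 'auto';
}
```

Using `??` rather than `||` means only `null` or `undefined` triggers the fallback; whether an empty string should also count as "not set" depends on how the settings are stored.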

## API Key Validation

Keys are validated before storage:

```typescript
import { NextResponse } from 'next/server';
import { validateApiKey } from '@agencio-predict/be/user/ai-settings-service';

const result = await validateApiKey('openai', 'sk-...');

if (!result.valid) {
  return NextResponse.json({ error: result.error }, { status: 400 });
}

// Key is valid, proceed with encryption and storage
```

### Validation Methods

| Provider | Validation Method |
|----------|------------------|
| OpenAI | `GET /v1/models` — checks for valid auth |
| Anthropic | `POST /v1/messages` — sends minimal test message |
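
The two validation calls could be described as plain request descriptors, as in the sketch below. This is not the code behind `validateApiKey`: `buildValidationRequest` and `ValidationRequest` are hypothetical names, and the model name, one-token limit, and `ping` message in the Anthropic body are assumptions chosen to keep the test call cheap. The URLs and header names follow the providers' public HTTP APIs.

```typescript
type Provider = 'openai' | 'anthropic';

interface ValidationRequest {
  url: string;
  method: 'GET' | 'POST';
  headers: Record<string, string>;
  body?: string;
}

function buildValidationRequest(provider: Provider, apiKey: string): ValidationRequest {
  if (provider === 'openai') {
    // Listing models succeeds only with valid credentials and consumes no tokens.
    return {
      url: 'https://api.openai.com/v1/models',
      method: 'GET',
      headers: { Authorization: `Bearer ${apiKey}` },
    };
  }

  // Anthropic: send the smallest possible message.
  return {
    url: 'https://api.anthropic.com/v1/messages',
    method: 'POST',
    headers: {
      'x-api-key': apiKey,
      'anthropic-version': '2023-06-01',
      'content-type': 'application/json',
    },
    body: JSON.stringify({
      model: 'claude-haiku-4-5-20251001', // assumed: the haiku-tier model listed under Model Tiers
      max_tokens: 1,
      messages: [{ role: 'user', content: 'ping' }],
    }),
  };
}
```

The descriptor can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Treating an HTTP 401 as "invalid key" and other failures as inconclusive would be a reasonable policy, though how the service actually interprets responses is not stated in this document.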

## Security Considerations

1. **Key Encryption**: All API keys are encrypted at rest with AES-256-GCM
2. **Key Rotation**: Encryption key (`AI_SETTINGS_ENCRYPTION_KEY`) should be rotated periodically
3. **Access Logging**: All API key access is logged for audit
4. **Least Privilege**: Decryption happens only server-side; plaintext keys are never sent to the frontend
5. **Mask Display**: UI shows only last 4 characters of stored keys
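
The masking rule in item 5 could be implemented along these lines. The function name `maskApiKey` and the fixed `****` prefix are assumptions; a fixed-width prefix is used here so that the mask does not reveal the key's length.

```typescript
// Show only the last 4 characters of a stored key.
function maskApiKey(key: string): string {
  const VISIBLE = 4;
  // Keys too short to mask meaningfully are hidden entirely.
  if (key.length <= VISIBLE) {
    return '****';
  }
  return '****' + key.slice(-VISIBLE);
}
```

Masking should happen on the server after decryption, so that only the masked string ever leaves the backend.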

## Environment Variables Summary

| Variable | Required | Purpose |
|----------|----------|---------|
| `LLMROUTER_BASE_URL` | Recommended | LLM Router endpoint |
| `LLMROUTER_API_KEY` | With above | LLM Router auth |
| `CLAUDE_API_KEY` | Fallback | Direct Anthropic access |
| `ANTHROPIC_API_KEY` | Fallback | Alternative to CLAUDE_API_KEY |
| `OPENAI_API_KEY` | Optional | For OpenAI-specific features |
| `AI_SETTINGS_ENCRYPTION_KEY` | Required | 32-char key for BYOK encryption |

## Key Files

| File | Purpose |
|------|---------|
| `packages/be/src/user/ai-settings-service.ts` | BYOK management, encryption, usage tracking |
| `packages/be/src/algorithms/llm/client.ts` | LLM client with provider routing |
| `packages/be/src/billing/services/metering-service.ts` | Usage metering for billing |
| `apps/web/src/app/settings/ai/page.tsx` | AI settings UI |
| `apps/web/src/app/settings/billing/page.tsx` | Usage dashboard UI |

## AI Usage Dashboard

The AI usage dashboard is available at `/settings/ai-usage`. Admins and owners see the full breakdown; members see totals only.

### Features

| Feature | Admin/Owner | Member |
|---------|-------------|--------|
| Total tokens & calls | ✅ | ✅ |
| Estimated cost | ✅ | ✅ |
| Usage by model (chart) | ✅ | ❌ |
| Daily trend (chart) | ✅ | ❌ |
| Cost breakdown table | ✅ | ❌ |

### API Endpoint

```
GET /api/predict/v1/user/ai-usage?days=30
```

**Response:**
```json
{
  "totalTokens": 1500000,
  "totalCalls": 2500,
  "estimatedCostCents": 4500,
  "byModel": {
    "claude-3-opus": { "tokens": 1000000, "calls": 1500, "cost": 3750 },
    "claude-3-haiku": { "tokens": 500000, "calls": 1000, "cost": 750 }
  },
  "byDay": [
    { "date": "2026-04-20", "tokens": 50000, "calls": 100 }
  ],
  "organization": { "id": "...", "name": "...", "role": "admin" },
  "canViewDetails": true,
  "days": 30
}
```

### Charts Available

1. **Summary Cards** - Total tokens, API calls, estimated cost, avg tokens/call
2. **Bar Chart** - Token consumption by model (horizontal)
3. **Pie Chart** - Model distribution (donut style)
4. **Line Chart** - Daily usage trend with tokens and calls

## AI Integration Points

The AI/LLM system integrates across the platform:

| Feature | AI Integration | Tracked? |
|---------|---------------|----------|
| **All-Seeing Eye** | Unified AI orchestration, black swan detection | Yes |
| **AI Engine** | Self-learning predictions, accuracy tracking | Yes |
| **Algorithm Builder** | DSL evaluation, LLM critique, jury system | Yes |
| **Create Trade (AI-Assisted)** | Trade signals with confidence, entry/exit levels | Yes |
| **Marketing Predictor** | Campaign performance forecasting | Yes |

### Create Trade → AI-Assisted Mode

The `/trades` page includes an AI-Assisted trade creation mode:

1. User enters symbol
2. System fetches signal from AI Engine (`GET /api/predict/v1/ai-engine/signal`)
3. AI returns: direction, confidence, entry/exit levels, reasoning
4. User reviews and confirms trade
5. Trade logged with `source: 'ai_signal'` for performance tracking

This creates a feedback loop:
- AI Engine generates signals
- Users execute trades based on signals
- All-Seeing Eye aggregates performance across all sources
- AI learns from outcomes to improve future signals

## Platform Predictions (All-Seeing Eye Generated)

The All-Seeing Eye generates platform predictions that appear in the Terminal Console feed. These are fully autonomous AI-generated insights with complete evidence trails.

### Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                      PLATFORM PREDICTIONS PIPELINE                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────────┐      ┌──────────────────┐      ┌──────────────────┐  │
│  │  Data Sources    │ ──▶  │  All-Seeing Eye  │ ──▶  │  Platform        │  │
│  │  (7 sources)     │      │  Orchestrator    │      │  Predictions     │  │
│  └──────────────────┘      └──────────────────┘      └──────────────────┘  │
│        │                          │                          │              │
│        │                          ▼                          ▼              │
│        │                   ┌──────────────┐           ┌────────────────┐   │
│        │                   │  • Ensemble  │           │  Terminal Feed │   │
│        │                   │  • BlackSwan │           │  /console      │   │
│        │                   │  • Anomaly   │           └────────────────┘   │
│        │                   └──────────────┘                                 │
│        │                                                                    │
│  ┌─────┴───────────────────────────────────────────────────────────────┐   │
│  │  Yahoo Finance • FRED • CoinGecko • Polymarket • Reddit • etc.     │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Prediction Content (Evidence Trail)

Every platform prediction includes a complete evidence trail:

| Field | Purpose | Example |
|-------|---------|---------|
| **What** | The prediction itself | "Crypto: Bullish Consensus (82%)" |
| **Why** | Reasoning/explanation | "Ensemble based on 7 signals. Vote: 5 up, 1 down, 1 flat." |
| **How** | Methodology | `insightType: 'ensemble'`, contributing signals |
| **Confidence** | How certain | 0.82 (82% confidence) |
| **Convergence** | Signal agreement | 0.78 (78% of signals align) |
| **Horizon** | Time frame | "short" (24h), "medium" (1 week), "long" (30 days) |
| **Priority** | Importance | "critical", "high", "medium", "low" |
| **Signals** | Data sources used | Top 5 contributing signals with weights |
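
To make the **Why**, **Convergence**, and **Horizon** rows concrete, the sketch below tallies signal votes into a majority direction and an agreement share, and maps horizon labels to hours. It is an unweighted illustration only: the All-Seeing Eye's actual ensemble presumably applies signal weights, and `tallyVotes`, `HORIZON_HOURS`, and the tie-breaking order are assumptions.

```typescript
type Direction = 'up' | 'down' | 'flat' | 'volatile';
type Horizon = 'short' | 'medium' | 'long';

// Horizon labels from the table: 24h, 1 week, 30 days.
const HORIZON_HOURS: Record<Horizon, number> = {
  short: 24,
  medium: 7 * 24,
  long: 30 * 24,
};

interface VoteTally {
  direction: Direction;               // majority direction
  convergence: number;                // share of votes agreeing with the majority, 0..1
  counts: Record<Direction, number>;
}

function tallyVotes(votes: Direction[]): VoteTally {
  if (votes.length === 0) {
    throw new Error('At least one vote is required');
  }
  const counts: Record<Direction, number> = { up: 0, down: 0, flat: 0, volatile: 0 };
  for (const vote of votes) {
    counts[vote] += 1;
  }

  // Ties are broken by this fixed order (an arbitrary choice for the sketch).
  const order: Direction[] = ['up', 'down', 'flat', 'volatile'];
  let direction: Direction = order[0];
  for (const candidate of order) {
    if (counts[candidate] > counts[direction]) {
      direction = candidate;
    }
  }

  return { direction, convergence: counts[direction] / votes.length, counts };
}
```

With the example vote from the table (5 up, 1 down, 1 flat), the majority direction is `up` and the unweighted convergence is 5/7, roughly 0.71.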

### Database Schema

```sql
-- Migration 075
CREATE TABLE predict.platform_predictions (
  id UUID PRIMARY KEY,
  insight_id VARCHAR(100) NOT NULL UNIQUE,
  insight_type VARCHAR(50) NOT NULL,  -- 'ensemble', 'black-swan', 'correlation', 'anomaly'

  -- The Prediction (WHAT)
  title VARCHAR(500) NOT NULL,
  direction VARCHAR(20) NOT NULL,     -- 'up', 'down', 'flat', 'volatile'
  probability DECIMAL(5,4) NOT NULL,
  confidence DECIMAL(5,4) NOT NULL,

  -- The Evidence (WHY/HOW)
  description TEXT,                   -- Detailed reasoning
  convergence_score DECIMAL(5,4),     -- How well signals agree
  contributing_signals JSONB,         -- Array of signal contributions
  source_data JSONB,                  -- Raw source context

  -- Time Context (WHEN)
  horizon VARCHAR(20) NOT NULL,       -- 'short', 'medium', 'long'
  horizon_hours INTEGER,
  expires_at TIMESTAMPTZ,

  -- Resolution Tracking (OUTCOME)
  status VARCHAR(30) DEFAULT 'active',
  resolved_at TIMESTAMPTZ,
  actual_direction VARCHAR(20),
  accuracy_score DECIMAL(5,4),

  created_at TIMESTAMPTZ DEFAULT NOW()
);
```

### API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| GET | `/api/predict/v1/platform-predictions` | List active predictions |
| GET | `/api/predict/v1/platform-predictions/status` | System status |
| POST | `/api/predict/v1/platform-predictions/generate` | Manual trigger |

### Console Display

Platform predictions appear in the Terminal Console (`/console`) with:

- **PLATFORM** badge (cyan color) distinguishing from user predictions
- **Type** badge (ensemble, black-swan, etc.)
- **Direction indicator** (▲ up, ▼ down, ~ flat, ⚡ volatile)
- **Confidence** and **Convergence** percentages
- **Top signals** that contributed to the prediction
- **Priority** badge with color coding

### Scheduler Integration

```typescript
// Platform predictions job runs every 15 minutes
registerJob('platform-predictions', 15 * 60 * 1000, platformPredictionsJob);

async function platformPredictionsJob() {
  // 1. Expire old predictions
  await expireOldPredictions();

  // 2. Generate from current All-Seeing Eye insights
  const predictions = await generatePlatformPredictions();
}
```

### Self-Learning

Platform predictions create a feedback loop:
1. All-Seeing Eye generates prediction with confidence
2. Prediction is stored with full evidence trail
3. When resolution date arrives, actual outcome is recorded
4. Accuracy score calculated per prediction
5. Signal weight learner uses outcomes to improve future predictions

## Troubleshooting

### "No LLM provider configured"

Ensure at least one of:
- `LLMROUTER_BASE_URL` + `LLMROUTER_API_KEY`
- `CLAUDE_API_KEY` or `ANTHROPIC_API_KEY`

### "Usage limit exceeded" (402)

The organization has exceeded its plan's AI token limit. Options:
1. Upgrade to a higher plan
2. Wait for billing period reset
3. Use organization BYOK to bypass platform limits

### Key validation failing

1. Check key format (OpenAI: `sk-...`, Anthropic: `sk-ant-...`)
2. Ensure key has not been revoked
3. Verify account has billing enabled on the provider side
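
The format check in step 1 can be done locally before calling the provider. A minimal sketch, assuming only the prefixes listed above (the function name `hasExpectedPrefix` is hypothetical, and rejecting `sk-ant-` keys for OpenAI is an added assumption, since Anthropic keys also begin with `sk-`):

```typescript
type Provider = 'openai' | 'anthropic';

// Cheap local check; it cannot tell whether a key is revoked or unbilled.
function hasExpectedPrefix(provider: Provider, key: string): boolean {
  const trimmed = key.trim(); // pasted keys often carry stray whitespace
  if (provider === 'anthropic') {
    return trimmed.startsWith('sk-ant-');
  }
  // An Anthropic key pasted into the OpenAI field would otherwise pass.
  return trimmed.startsWith('sk-') && !trimmed.startsWith('sk-ant-');
}
```

A key that passes this check can still fail validation for the reasons in steps 2 and 3, so it only rules out obvious paste errors.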