# Agencio Predict - Security Overview

## Threat Model

Agencio Predict handles prediction market data, marketing campaign intelligence,
external API credentials, and user authentication. The attack surface spans:

```
┌─────────────────────────────────────────────────────────────────────┐
│                         THREAT LANDSCAPE                            │
│                                                                     │
│  EXTERNAL                          INTERNAL                         │
│  ├─ API abuse / scraping           ├─ Service-to-service spoofing   │
│  ├─ Auth bypass / token theft      ├─ Secret leakage from logs      │
│  ├─ Webhook injection              ├─ Credential store compromise   │
│  ├─ Feed data poisoning            ├─ Insider manipulation of       │
│  ├─ XSS / CSRF on terminal UI      │   prediction events            │
│  ├─ SQL injection via filters      ├─ Privilege escalation          │
│  ├─ Rate limit bypass              ├─ Unmatched signal injection    │
│  ├─ DDoS on public endpoints       └─ Feed adapter code injection   │
│  ├─ OAuth token interception                                        │
│  ├─ Webhook replay attacks                                          │
│  └─ Supply chain (npm / PyPI)                                       │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 1. Authentication & Identity

### 1.1 Primary Auth: Agencio Authentication Service (bertha-auth-service)

Agencio Predict delegates all user authentication to the existing **Agencio
Authentication Service** (`bertha-auth-service` v2.0.0). This is a multi-tenant
Node.js microservice already in production. No custom auth is built for Predict.

```
┌─────────────────────────────────────────────────────────────────────┐
│                   Agencio Authentication Service                    │
│                   (bertha-auth-service v2.0.0)                      │
│                                                                     │
│  Public Endpoints:                                                  │
│    POST /api/v1/auth/signin          → issue JWT                    │
│    POST /api/v1/auth/signup          → register user                │
│    POST /api/v1/auth/verify-token    → validate JWT (for services)  │
│    POST /api/v1/auth/refresh-token   → refresh JWT                  │
│    POST /api/v1/auth/reset-password  → password reset               │
│                                                                     │
│  Protected Endpoints:                                               │
│    POST /api/v1/auth/signout                                        │
│    POST /api/v1/auth/update-password                                │
│    POST /api/v1/auth/switch-organization/:id                        │
│    POST /api/v1/auth/generate-service-token                         │
│                                                                     │
│  User Management:      GET/PUT /api/v1/users/me                     │
│  Organisations:        GET /api/v1/organizations                    │
│  Memberships:          GET /api/v1/memberships                      │
│  Invitations:          GET /api/v1/invitations                      │
│  Sessions:             GET /api/v1/sessions/active                  │
│  MFA:                  POST /api/v1/auth/mfa/setup|verify|disable   │
│  SSO:                  GET /api/v1/auth/sso/login/:provider         │
│  Privacy/GDPR:         POST /api/v1/privacy/deletions               │
│  Service Tokens:       POST /api/v1/token-manager/service-token     │
│  Admin:                GET/POST/PUT/DELETE /api/v1/admin/users      │
└──────────────────────────┬──────────────────────────────────────────┘
                           │
                           │  JWT (HS256)
                           │
┌──────────────────────────▼──────────────────────────────────────────┐
│                   Agencio Predict Services                          │
│                                                                     │
│  Every service validates JWT using the SAME shared JWT_SECRET       │
│  from bertha-auth-service. Token contains:                          │
│                                                                     │
│  {                                                                  │
│    sub: "user-uuid",           // user ID                           │
│    email: "user@example.com",                                       │
│    name: "JJ",                                                      │
│    organizationId: "org-uuid", // current org context                │
│    organizationName: "Acme",                                        │
│    role: "admin",              // role in current org                │
│    globalRole: "user",         // system-wide role                   │
│    permissions: ["events:read", "rules:own", ...],                  │
│    organizations: [{ id, name, role }],                              │
│    type: "user",               // "user" | "service"                │
│    iat: 1711234567,                                                 │
│    exp: 1711238167                                                  │
│  }                                                                  │
└─────────────────────────────────────────────────────────────────────┘
```

**How Predict integrates:**

1. **Frontend** redirects to auth service login page (or embeds auth components)
2. Auth service issues JWT signed with `JWT_SECRET` (HS256)
3. Frontend stores JWT and sends `Authorization: Bearer <token>` on all requests
4. **Every Predict service** validates JWT locally using the same `JWT_SECRET`
5. User context (`req.user`) extracted from token claims — no DB lookup needed
6. For critical operations, services call `POST /api/v1/auth/verify-token` to cross-check

```typescript
// middleware/auth.ts — Predict services reuse bertha-auth-service JWT pattern
import type { Request, Response, NextFunction } from 'express';
import jwt from 'jsonwebtoken';

interface PredictUser {
  id: string;
  email: string;
  name: string;
  organizationId: string;
  organizationName: string;
  role: string;             // role in current org
  globalRole: string;       // system-wide role
  permissions: string[];
  organizations: { id: string; name: string; role: string }[];
  tokenType: 'user' | 'service';
}

function authenticate(req: Request, res: Response, next: NextFunction) {
  const authHeader = req.headers.authorization;
  if (!authHeader?.startsWith('Bearer ')) {
    return res.status(401).json({ success: false, message: 'Token missing', code: 'NO_TOKEN' });
  }

  const token = authHeader.substring(7);

  try {
    // Pin verification to HS256, the algorithm the auth service signs with
    const decoded = jwt.verify(token, process.env.JWT_SECRET!, {
      algorithms: ['HS256']
    }) as any;

    req.user = {
      id: decoded.sub || decoded.userId || decoded.id,
      email: decoded.email,
      name: decoded.name,
      organizationId: decoded.organizationId,
      organizationName: decoded.organizationName,
      role: decoded.role,
      globalRole: decoded.globalRole,
      permissions: decoded.permissions || [],
      organizations: decoded.organizations || [],
      tokenType: decoded.type || 'user',
    };

    next();
  } catch (error) {
    if (error instanceof jwt.TokenExpiredError) {
      return res.status(401).json({ success: false, message: 'Token expired', code: 'TOKEN_EXPIRED' });
    }
    return res.status(401).json({ success: false, message: 'Invalid token', code: 'INVALID_TOKEN' });
  }
}
```

**What we do NOT build:**
- No signup/signin logic in Predict (auth service handles it)
- No password management (auth service handles it)
- No MFA (auth service supports TOTP + backup codes)
- No SSO (auth service supports OAuth providers)
- No session management (auth service tracks active sessions)
- No GDPR deletion workflow (auth service owns the `/privacy/deletions` endpoint;
  Predict only listens for the deletion webhook and purges user data from Predict tables)

### 1.2 Service-to-Service Authentication

The auth service provides a dedicated **service token** mechanism via its
token management API. Predict services use this instead of shared static keys.

```
┌──────────────────┐  POST /token-manager/service-token/predict-intel  ┌──────────────────┐
│  predict-worker  │ ────────────────────────────────────────────────▶  │  Auth Service    │
│                  │ ◀────────────────────────────────────────────────  │                  │
│                  │  { token: "eyJ...", expiresIn: 3600 }             │                  │
└────────┬─────────┘                                                   └──────────────────┘
         │
         │  Authorization: Bearer <service-token>
         │  X-Service-Name: predict-worker
         │  X-Request-ID: uuid
         ▼
┌──────────────────┐
│  predict-intel   │  → validates service token using same JWT_SECRET
│                  │  → checks token.type === 'service'
│                  │  → checks token.serviceName is allowed caller
└──────────────────┘
```

**Controls:**
- Service tokens issued by auth service's token manager (`/api/v1/token-manager/service-token/:serviceName`)
- Tokens are short-lived (1 hour), auto-refreshed by the calling service
- Service tokens stored in Redis (auth service manages lifecycle)
- `X-Request-ID` propagated for distributed tracing (UUID per request chain)
- Internal endpoints NOT exposed on ALB — only reachable within VPC/Docker network
- In production: security groups restrict traffic to known service IPs
- Auth service validates service token scopes via `servicePermissions.config.js`
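
The "short-lived, auto-refreshed" control can be sketched as a small cache that the calling service consults before each request. This is a minimal sketch rather than the auth service's client: `ServiceTokenCache`, the 60-second refresh margin, and the injectable clock are assumptions for illustration; only the `{ token, expiresIn }` response shape comes from the diagram above.

```typescript
// Hypothetical helper: caches a service token and reports when a refresh is due.
interface ServiceTokenResponse {
  token: string;
  expiresIn: number; // seconds, as in the auth service response shown above
}

class ServiceTokenCache {
  private token: string | null = null;
  private expiresAtMs = 0;
  private readonly refreshMarginMs: number;

  // refreshMarginMs is an assumed safety margin so tokens are renewed before expiry.
  constructor(refreshMarginMs: number = 60_000) {
    this.refreshMarginMs = refreshMarginMs;
  }

  // Store a freshly issued token together with its absolute expiry time.
  store(response: ServiceTokenResponse, nowMs: number = Date.now()): void {
    this.token = response.token;
    this.expiresAtMs = nowMs + response.expiresIn * 1000;
  }

  // Returns the cached token, or null when it is missing or inside the refresh margin.
  get(nowMs: number = Date.now()): string | null {
    if (this.token === null) return null;
    return nowMs < this.expiresAtMs - this.refreshMarginMs ? this.token : null;
  }
}
```

A caller would request a new token from the token manager whenever `get()` returns `null`, then pass the response to `store()`.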

```typescript
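// middleware/service-auth.ts
import type { Request, Response, NextFunction } from 'express';
import jwt from 'jsonwebtoken';

function validateServiceToken(req: Request, res: Response, next: NextFunction) {
  // Accept only a well-formed "Bearer <token>" header
  const authHeader = req.headers.authorization;
  const token = authHeader?.startsWith('Bearer ') ? authHeader.substring(7) : undefined;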
  const serviceName = req.headers['x-service-name'] as string;

  if (!token || !serviceName) {
    return res.status(401).json({ success: false, code: 'MISSING_SERVICE_AUTH' });
  }

  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET!, { algorithms: ['HS256'] }) as any;

    if (decoded.type !== 'service') {
      return res.status(403).json({ success: false, code: 'NOT_SERVICE_TOKEN' });
    }

    if (decoded.serviceName !== serviceName) {
      return res.status(403).json({ success: false, code: 'SERVICE_NAME_MISMATCH' });
    }

    // Check if this service is allowed to call this endpoint.
    // Deny by default: a path with no SERVICE_PERMISSIONS entry accepts no service callers.
    const allowedCallers = SERVICE_PERMISSIONS[req.path];
    if (!allowedCallers || !allowedCallers.includes(serviceName)) {
      return res.status(403).json({ success: false, code: 'SERVICE_NOT_AUTHORIZED' });
    }

    req.service = { name: serviceName, token: decoded };
    next();
  } catch (error) {
    return res.status(401).json({ success: false, code: 'INVALID_SERVICE_TOKEN' });
  }
}
```

### 1.3 API Key Authentication (External API Product)

For the API product (developers, partners), API keys are used instead of JWT.

**Controls:**
- API keys are 256-bit random, prefixed with `pk_live_` or `pk_test_`
- Keys hashed with SHA-256 before storage (original never stored)
- Key lookup via hash — O(1) with index
- Rate limiting applied per key based on subscription tier
- Keys can be scoped to specific endpoints (read-only, signals-only, etc.)
- Key rotation: new key issued, old key valid for 24h grace period
- Revocation is immediate — revoked keys cached in Redis for fast rejection
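
The key format and hashing controls can be illustrated with Node's built-in `crypto` module. This is a sketch under stated assumptions: the function names, the base64url encoding of the random part, and the `GeneratedApiKey` shape are hypothetical; the `pk_live_` / `pk_test_` prefixes, the 256-bit size, the SHA-256 hash, and the 12-character display prefix come from this section.

```typescript
import { createHash, randomBytes } from 'crypto';

type KeyMode = 'live' | 'test';

interface GeneratedApiKey {
  plaintext: string; // shown to the user once, never stored
  keyPrefix: string; // first 12 characters, safe to display and log
  keyHash: string;   // hex SHA-256 digest, the only value persisted
}

// A fast hash is acceptable here because the input is 256 bits of randomness,
// not a human-chosen password.
function hashApiKey(plaintext: string): string {
  return createHash('sha256').update(plaintext, 'utf8').digest('hex');
}

function generateApiKey(mode: KeyMode): GeneratedApiKey {
  const secret = randomBytes(32).toString('base64url'); // 256 bits of entropy
  const plaintext = `pk_${mode}_${secret}`;
  return {
    plaintext,
    keyPrefix: plaintext.slice(0, 12),
    keyHash: hashApiKey(plaintext),
  };
}
```

On each request the presented key is passed through `hashApiKey` and the digest is looked up in `key_hash`.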

```sql
-- API key storage (only the hash is stored)
CREATE TABLE prediction_api_keys (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES auth.users(id),
  name TEXT NOT NULL,                    -- "Production key", "Dev key"
  key_prefix TEXT NOT NULL,              -- "pk_live_abc1" (first 12 chars for display)
  key_hash TEXT NOT NULL UNIQUE,         -- SHA-256 hash of full key
  scopes TEXT[] DEFAULT '{read}',        -- ['read', 'signals', 'trust', 'marketing', 'write']
  tier TEXT NOT NULL DEFAULT 'developer',
  rate_limit_rpm INTEGER NOT NULL DEFAULT 10,
  rate_limit_rpd INTEGER NOT NULL DEFAULT 100,
  last_used_at TIMESTAMPTZ,
  last_used_ip INET,
  is_active BOOLEAN DEFAULT true,
  expires_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT now(),
  revoked_at TIMESTAMPTZ
);

CREATE INDEX idx_api_keys_hash ON prediction_api_keys(key_hash) WHERE is_active = true;
CREATE INDEX idx_api_keys_user ON prediction_api_keys(user_id);
```

### 1.4 Role-Based Access Control (RBAC)

```
┌───────────┬─────────────────────────────────────────────────────────┐
│ Role      │ Permissions                                            │
├───────────┼─────────────────────────────────────────────────────────┤
│ user      │ events:read (delayed), rules:own (3 max),              │
│ (free)    │ campaigns:own (5/mo)                                   │
├───────────┼─────────────────────────────────────────────────────────┤
│ pro       │ events:read (realtime), events:export,                 │
│           │ rules:own (unlimited), campaigns:own (unlimited),      │
│           │ trust:read, calibration:read, explanations:read        │
├───────────┼─────────────────────────────────────────────────────────┤
│ team_admin│ ...pro + rules:team, campaigns:team,                   │
│           │ feeds:manage, members:manage, billing:manage            │
├───────────┼─────────────────────────────────────────────────────────┤
│ admin     │ * (all permissions)                                     │
│ (system)  │ system_rules:manage, feeds:system, events:resolve,     │
│           │ users:manage, audit:read                                │
├───────────┼─────────────────────────────────────────────────────────┤
│ api       │ Scoped by API key: events:read, signals:read,          │
│ (service) │ trust:read, calibration:read (no write by default)     │
└───────────┴─────────────────────────────────────────────────────────┘
```

**Enforcement:**

```typescript
// middleware/rbac.ts
function requirePermission(...permissions: string[]) {
  return (req: AuthenticatedRequest, res: Response, next: NextFunction) => {
    const userPermissions = resolvePermissions(req.user.role); // PredictUser carries a single org-scoped `role`
    const hasAll = permissions.every(p => userPermissions.includes(p) || userPermissions.includes('*'));
    if (!hasAll) {
      return res.status(403).json({ error: { code: 'FORBIDDEN', message: 'Insufficient permissions' } });
    }
    next();
  };
}

// Usage
router.post('/events/:id/resolve', requirePermission('events:resolve'), resolveEventHandler);
router.delete('/feeds/:id', requirePermission('feeds:manage'), deleteFeedHandler);
```
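
The middleware relies on a `resolvePermissions` helper that is not shown in this document. One possible shape, assuming it takes a single role name and that the mapping mirrors the RBAC table above, is sketched below. Quotas and qualifiers from the table ("3 max", "delayed", "realtime") are not modelled, and `hasPermission` is a hypothetical convenience wrapper.

```typescript
// Hypothetical role-to-permission map derived from the RBAC table.
const PRO_PERMISSIONS: readonly string[] = [
  'events:read', 'events:export', 'rules:own', 'campaigns:own',
  'trust:read', 'calibration:read', 'explanations:read',
];

// A Map avoids accidental hits on inherited keys such as "constructor".
const ROLE_PERMISSIONS = new Map<string, readonly string[]>([
  ['user', ['events:read', 'rules:own', 'campaigns:own']],
  ['pro', PRO_PERMISSIONS],
  ['team_admin', [...PRO_PERMISSIONS, 'rules:team', 'campaigns:team',
                  'feeds:manage', 'members:manage', 'billing:manage']],
  ['admin', ['*']],
  // Default API scopes; real keys are narrowed further by their own scopes.
  ['api', ['events:read', 'signals:read', 'trust:read', 'calibration:read']],
]);

// Unknown roles resolve to no permissions at all.
function resolvePermissions(role: string): readonly string[] {
  return ROLE_PERMISSIONS.get(role) ?? [];
}

function hasPermission(role: string, permission: string): boolean {
  const granted = resolvePermissions(role);
  return granted.includes('*') || granted.includes(permission);
}
```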

---

## 2. Data Protection

### 2.1 Encryption

| Layer | Mechanism | Details |
|-------|-----------|---------|
| In transit (external) | TLS 1.3 | ALB terminates TLS, CloudFront enforces HTTPS |
| In transit (internal) | TLS 1.2+ (prod) / plaintext (dev) | ECS service mesh or VPC-internal TLS |
| At rest (database) | AES-256 via RDS encryption | Enabled at instance level |
| At rest (S3) | AES-256 via SSE-KMS | KMS key per bucket |
| At rest (Redis) | AES-256 via ElastiCache encryption | Encryption at-rest enabled |
| Credentials in DB | AES-256-GCM application-level | Encrypted before INSERT, decrypted on read |
| Secrets Manager | AWS KMS envelope encryption | Automatic rotation supported |

### 2.2 Credential Encryption (Application Level)

External platform credentials (Slack tokens, Google Ads OAuth, feed API keys)
are encrypted at the application layer before database storage.

```typescript
// crypto/credentials.ts
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

const ALGORITHM = 'aes-256-gcm';

function encryptCredentials(plaintext: object): EncryptedPayload {
  const key = Buffer.from(process.env.CREDENTIALS_ENCRYPTION_KEY!, 'hex'); // 32 bytes
  const iv = randomBytes(12); // 96-bit IV, the recommended size for GCM; must be unique per encryption
  const cipher = createCipheriv(ALGORITHM, key, iv);

  let encrypted = cipher.update(JSON.stringify(plaintext), 'utf8', 'hex');
  encrypted += cipher.final('hex');
  const authTag = cipher.getAuthTag().toString('hex');

  return { encrypted, iv: iv.toString('hex'), authTag, version: 1 };
}

function decryptCredentials(payload: EncryptedPayload): object {
  const key = Buffer.from(process.env.CREDENTIALS_ENCRYPTION_KEY!, 'hex');
  const decipher = createDecipheriv(ALGORITHM, key, Buffer.from(payload.iv, 'hex'));
  decipher.setAuthTag(Buffer.from(payload.authTag, 'hex'));

  let decrypted = decipher.update(payload.encrypted, 'hex', 'utf8');
  decrypted += decipher.final('utf8');
  return JSON.parse(decrypted);
}
```

**Key management:**
- `CREDENTIALS_ENCRYPTION_KEY` stored in AWS Secrets Manager and supplied to the process at runtime (never hard-coded or committed)
- Key rotation: encrypt with new key, re-encrypt existing records in background job
- Version field on payload allows gradual migration during rotation
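
How the `version` field supports gradual rotation can be sketched with a keyring that maps each version to its key. This is an illustration, not the production job: `Keyring`, `encryptWithVersion`, `decryptWithVersion`, and `rotatePayload` are hypothetical names, and the payload shape is assumed to match the `EncryptedPayload` returned by `encryptCredentials` above.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

interface EncryptedPayload {
  encrypted: string;
  iv: string;
  authTag: string;
  version: number;
}

// Hypothetical keyring: payload version -> 32-byte AES key.
type Keyring = Map<number, Buffer>;

function keyFor(keyring: Keyring, version: number): Buffer {
  const key = keyring.get(version);
  if (!key) throw new Error(`No encryption key for version ${version}`);
  return key;
}

function encryptWithVersion(plaintext: object, keyring: Keyring, version: number): EncryptedPayload {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', keyFor(keyring, version), iv);
  const encrypted = cipher.update(JSON.stringify(plaintext), 'utf8', 'hex') + cipher.final('hex');
  return { encrypted, iv: iv.toString('hex'), authTag: cipher.getAuthTag().toString('hex'), version };
}

// The key is chosen by the version recorded on the payload, so old and new
// records can coexist while the background job works through the table.
function decryptWithVersion(payload: EncryptedPayload, keyring: Keyring): object {
  const decipher = createDecipheriv('aes-256-gcm', keyFor(keyring, payload.version), Buffer.from(payload.iv, 'hex'));
  decipher.setAuthTag(Buffer.from(payload.authTag, 'hex'));
  const decrypted = decipher.update(payload.encrypted, 'hex', 'utf8') + decipher.final('utf8');
  return JSON.parse(decrypted);
}

// One step of the re-encryption job: leave current records alone, upgrade stale ones.
function rotatePayload(payload: EncryptedPayload, keyring: Keyring, currentVersion: number): EncryptedPayload {
  if (payload.version === currentVersion) return payload;
  return encryptWithVersion(decryptWithVersion(payload, keyring), keyring, currentVersion);
}
```

Once no stored payload carries the old version, its key can be removed from the keyring.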

### 2.3 Sensitive Data Handling

| Data Type | Storage | Access | Logging |
|-----------|---------|--------|---------|
| User passwords | Never stored in Predict (delegated to the Agencio Authentication Service) | N/A | Never logged |
| OAuth tokens | Encrypted JSONB in DB | Decrypted only at action execution time | Token value never logged |
| API keys | SHA-256 hash only | Original shown once at creation | Prefix only in logs |
| Feed credentials | Encrypted JSONB in DB | Decrypted only during feed runs | Never logged |
| Webhook URLs | Plaintext in DB (user's own) | User access only via RLS | Domain only in logs |
| Prediction data | Plaintext in DB | RLS + RBAC | Full logging OK |
| User PII (email/name) | Plaintext in DB | Auth provider manages | Email masked in logs |
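
The logging column of the table can be sketched as a few small helpers. The function names and the exact masking format are assumptions for illustration; the table only states that emails are masked and that webhook URLs are logged by domain.

```typescript
// Keeps the first character and the domain: "jane.doe@example.com" -> "j***@example.com".
function maskEmail(email: string): string {
  const at = email.lastIndexOf('@');
  if (at <= 0) return '***'; // not a usable address, reveal nothing
  return `${email[0]}***${email.slice(at)}`;
}

// Logs only the host of a webhook URL, dropping path and query (which may embed secrets).
function webhookLogDomain(url: string): string {
  try {
    return new URL(url).hostname;
  } catch {
    return '[invalid-url]';
  }
}
```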

### 2.4 Data Classification

```
┌─────────────┬────────────────────────────────────────────┐
│ CRITICAL    │ OAuth tokens, API keys, encryption keys,   │
│             │ feed credentials, Secrets Manager values    │
│             │ → Encrypted at rest + in transit            │
│             │ → Access logged and auditable               │
│             │ → Auto-rotated where possible               │
├─────────────┼────────────────────────────────────────────┤
│ SENSITIVE   │ User emails, campaign data, trust flags,   │
│             │ rule configurations, billing info           │
│             │ → Encrypted at rest (RDS/S3 encryption)     │
│             │ → RLS enforced                              │
│             │ → PII masked in logs                        │
├─────────────┼────────────────────────────────────────────┤
│ INTERNAL    │ Prediction signals, snapshots, explanations,│
│             │ calibration scores, feed run logs           │
│             │ → Encrypted at rest (RDS encryption)        │
│             │ → Standard access controls                  │
├─────────────┼────────────────────────────────────────────┤
│ PUBLIC      │ Published prediction events (free tier),    │
│             │ calibration reports, public API responses   │
│             │ → No special protection needed              │
└─────────────┴────────────────────────────────────────────┘
```

---

## 3. Input Validation & Injection Prevention

### 3.1 API Input Validation

Every API endpoint validates input with Zod schemas before processing.

```typescript
// validation/events.ts
import { z } from 'zod';

const CreateEventSchema = z.object({
  title: z.string().min(5).max(500).trim(),
  description: z.string().max(5000).optional(),
  category_id: z.string().uuid(),
  resolution_criteria: z.string().max(2000).optional(),
  resolution_date: z.string().datetime().optional(),
  source_urls: z.array(z.string().url().max(2000)).max(10).optional(),
  tags: z.array(z.string().max(50).regex(/^[a-z0-9-]+$/)).max(20).optional(),
});

const EventFilterSchema = z.object({
  category: z.string().max(50).optional(),
  status: z.enum(['ACTIVE', 'RESOLVED_YES', 'RESOLVED_NO', 'ALL']).default('ACTIVE'),
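  // Query-string values arrive as strings, so numeric filters are coerced before checking
  min_probability: z.coerce.number().min(0).max(1).optional(),
  max_probability: z.coerce.number().min(0).max(1).optional(),
  search: z.string().max(200).optional(),
  sort: z.string().regex(/^-?[a-z_]+$/).optional(), // shape check only; the column is whitelisted before use
  page: z.coerce.number().int().min(1).max(1000).default(1),
  limit: z.coerce.number().int().min(1).max(100).default(20),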
});
```

### 3.2 SQL Injection Prevention

- All database queries use parameterised queries (never string concatenation)
- Supabase client / `pg` library with `$1, $2` placeholders
- No raw SQL from user input — sort fields validated against whitelist
- `search` parameter uses `plainto_tsquery()` (not `to_tsquery()`) to prevent operator injection

```typescript
// NEVER this:
const result = await db.query(`SELECT * FROM events WHERE title LIKE '%${search}%'`);

// ALWAYS this:
const result = await db.query(
  `SELECT * FROM events WHERE to_tsvector('english', title) @@ plainto_tsquery('english', $1)`,
  [search]
);
```
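
Placeholders cannot carry identifiers, so the `sort` field needs the whitelist mentioned above before it reaches an `ORDER BY` clause. A minimal sketch follows; the column names in `SORTABLE_COLUMNS` and the default ordering are hypothetical, since this document does not list the sortable columns.

```typescript
// Hypothetical whitelist: API sort names mapped to real column names.
const SORTABLE_COLUMNS = new Map<string, string>([
  ['created_at', 'created_at'],
  ['probability', 'probability'],
  ['resolution_date', 'resolution_date'],
]);

// Turns "-created_at" into "created_at DESC". Returns null for anything that is
// not whitelisted, so the caller can answer with a 400 instead of building SQL.
function buildOrderBy(sort: string | undefined, fallback: string = 'created_at DESC'): string | null {
  if (sort === undefined) return fallback;
  const descending = sort.startsWith('-');
  const column = SORTABLE_COLUMNS.get(descending ? sort.slice(1) : sort);
  if (column === undefined) return null;
  return `${column} ${descending ? 'DESC' : 'ASC'}`;
}
```

Only the value looked up from the map is ever interpolated into the query text; the user's string is used purely as a lookup key.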

### 3.3 XSS Prevention

- All user-generated text rendered via React (auto-escapes by default)
- `dangerouslySetInnerHTML` prohibited via ESLint rule
- AI-generated explanations sanitised with DOMPurify before rendering
- Content-Security-Policy header blocks inline scripts

```
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; connect-src 'self' wss: https:;
```

### 3.4 CSRF Protection

- JWT sent in the `Authorization` header (not cookies): browsers never attach it automatically, so forged cross-site requests arrive unauthenticated
- For any cookie-based flows: `SameSite=Strict` + CSRF token
- State parameter enforced on all OAuth flows

### 3.5 Webhook Input Validation

Inbound webhooks (custom feed push, external events) are untrusted input.

```typescript
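// Webhook receiver validation
import { createHmac, timingSafeEqual } from 'crypto';

// `feed` is the record already loaded for the feed ID; `rawBody` is the unparsed request body.
function validateWebhookPayload(
  feed: { webhook_secret: string; allowed_ips?: string[] },
  req: Request,
  rawBody: Buffer
): boolean {
  // 1. Check timestamp (prevent replay); header is Unix epoch milliseconds
  const timestampHeader = String(req.headers['x-predict-timestamp']);
  const timestamp = Number(timestampHeader);
  if (!Number.isFinite(timestamp) || Math.abs(Date.now() - timestamp) > 300_000) return false; // 5 min window

  // 2. Verify HMAC over "<timestamp>.<raw body>" so the timestamp cannot be swapped on replay
  const signature = req.headers['x-predict-signature'];
  if (typeof signature !== 'string') return false;
  const expectedSig = createHmac('sha256', feed.webhook_secret)
    .update(`${timestampHeader}.`)
    .update(rawBody)
    .digest();
  const providedSig = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  if (providedSig.length !== expectedSig.length || !timingSafeEqual(providedSig, expectedSig)) return false;

  // 3. Validate payload schema
  const parsed = WebhookPayloadSchema.safeParse(req.body);
  if (!parsed.success) return false;

  // 4. Check IP allowlist (if configured)
  if (feed.allowed_ips?.length && !feed.allowed_ips.includes(req.ip ?? '')) return false;

  return true;
}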
```

### 3.6 Feed Data Poisoning Prevention

Malicious actors could submit fake signals to manipulate predictions.

**Controls:**
- Signals from external feeds are tagged with source reliability score
- Anomaly detection flags signals that deviate significantly from other sources
- Trust layer detects coordinated manipulation patterns
- Source diversity scoring — single-source-driven movements flagged
- Manual review queue for high-impact signals from low-trust sources
- Rate limiting per feed (max signals per window)
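
The anomaly-detection control can be illustrated with a simple median comparison. This is a sketch of one possible heuristic, not the detection logic actually used: the `SourceSignal` shape, the 0.25 deviation threshold, and the minimum of three peer sources are all assumptions, since this document does not define what "deviate significantly" means.

```typescript
interface SourceSignal {
  sourceId: string;
  probability: number; // implied probability in [0, 1]
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Flags a signal whose probability differs from the median of the *other*
// sources by more than maxDeviation. With fewer than minPeers other sources
// there is too little evidence, so nothing is flagged.
function isAnomalous(
  candidate: SourceSignal,
  all: SourceSignal[],
  maxDeviation: number = 0.25,
  minPeers: number = 3,
): boolean {
  const peers = all
    .filter(signal => signal.sourceId !== candidate.sourceId)
    .map(signal => signal.probability);
  if (peers.length < minPeers) return false;
  return Math.abs(candidate.probability - median(peers)) > maxDeviation;
}
```

A flagged signal from a low-trust source would then be routed to the manual review queue rather than applied directly.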

---

## 4. Rate Limiting & Abuse Prevention

### 4.1 Rate Limit Architecture

```
┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│  WAF     │────▶│  ALB     │────▶│  App     │────▶│  Redis   │
│  (L7)    │     │          │     │  Layer   │     │  Counter  │
│          │     │          │     │          │     │          │
│ IP-based │     │ Conn     │     │ Token/   │     │ Sliding  │
│ geo-block│     │ limits   │     │ Key      │     │ window   │
│ bot det  │     │          │     │ based    │     │          │
└──────────┘     └──────────┘     └──────────┘     └──────────┘
```

### 4.2 Rate Limits by Layer

| Layer | Scope | Limit | Action |
|-------|-------|-------|--------|
| WAF | Per IP | 1000 req/min | Block IP for 10 min |
| WAF | Per IP (auth endpoints) | 20 req/min | Block IP for 30 min |
| App | Per user (free) | 30 req/min | 429 response |
| App | Per user (pro) | 300 req/min | 429 response |
| App | Per API key (developer) | 10 req/min, 100/day | 429 response |
| App | Per API key (startup) | 200 req/min, 10K/day | 429 response |
| App | Per API key (growth) | 2000 req/min, 100K/day | 429 response |
| App | Webhook inbound | 60 req/min per feed | 429 + drop signal |
| App | Explanation generation | 10 req/hour per user | Queue overflow |
| App | Campaign prediction | 20 req/hour (free), unlimited (pro) | 429 response |

### 4.3 Implementation

```typescript
// middleware/rate-limit.ts
import { Redis } from 'ioredis';

async function slidingWindowRateLimit(
  redis: Redis,
  key: string,
  limit: number,
  windowSeconds: number
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  const now = Date.now();
  const windowStart = now - windowSeconds * 1000;

  const pipeline = redis.multi();  // MULTI/EXEC so the four commands below run atomically
  pipeline.zremrangebyscore(key, 0, windowStart);  // Remove expired entries
  pipeline.zadd(key, now, `${now}-${Math.random()}`);  // Add current request
  pipeline.zcard(key);  // Count requests in window
  pipeline.expire(key, windowSeconds);  // Set TTL

  const results = await pipeline.exec();
  const count = results![2][1] as number;

  return {
    allowed: count <= limit,
    remaining: Math.max(0, limit - count),
    resetAt: Math.ceil((now + windowSeconds * 1000) / 1000),  // when the entry just added leaves the window
  };
}
```

### 4.4 Anti-Scraping

Prediction data has commercial value — scraping must be prevented.

**Controls:**
- Free tier: 24-hour delayed data (scrapers get stale data)
- Pagination enforced (max 100 items, no `OFFSET > 10000`)
- Response fingerprinting (invisible watermarks in JSON key ordering per user)
- Honeypot endpoints that flag scraper bots
- User-Agent validation + browser fingerprinting on web UI
- API responses include `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers
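
The pagination control can be sketched as a guard that converts `page` and `limit` into a bounded `LIMIT`/`OFFSET` pair. The function and constant names are hypothetical; the two caps come from the bullet above.

```typescript
const MAX_LIMIT = 100;     // "max 100 items"
const MAX_OFFSET = 10_000; // "no OFFSET > 10000"

interface PageWindow {
  limit: number;
  offset: number;
}

// Returns the LIMIT/OFFSET pair for a request, or null when the input is
// invalid or the requested page lies deeper than the offset cap allows.
function toPageWindow(page: number, limit: number): PageWindow | null {
  if (!Number.isInteger(page) || !Number.isInteger(limit) || page < 1 || limit < 1) return null;
  const cappedLimit = Math.min(limit, MAX_LIMIT);
  const offset = (page - 1) * cappedLimit;
  return offset > MAX_OFFSET ? null : { limit: cappedLimit, offset };
}
```

With `limit` at 100 the deepest reachable page is 101, which is stricter than the `page` maximum of 1000 that `EventFilterSchema` accepts, so the offset check has to run in addition to schema validation.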

---

## 5. Webhook & Action Security

### 5.1 Outbound Webhook Signing

When Agencio Predict sends webhooks (rule actions), they are signed so
receivers can verify authenticity.

```typescript
import crypto from 'crypto';

function signWebhookPayload(payload: object, secret: string): string {
  const timestamp = Math.floor(Date.now() / 1000);
  const body = JSON.stringify(payload);
  const signatureInput = `${timestamp}.${body}`;
  const signature = crypto.createHmac('sha256', secret).update(signatureInput).digest('hex');
  return `t=${timestamp},v1=${signature}`;
}

// Sent as header:
// X-Predict-Signature: t=1711234567,v1=abc123...
```
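
On the receiving side, verification mirrors the signing steps. The sketch below shows how a receiver might check the `X-Predict-Signature` header; the function name, the 300-second tolerance, and the injectable clock are assumptions, while the `t=<timestamp>,v1=<hex>` format and the `<timestamp>.<body>` signing input follow the signer above.

```typescript
import { createHmac, timingSafeEqual } from 'crypto';

// Verifies a header of the form "t=<unix seconds>,v1=<hex hmac-sha256>".
// rawBody must be the exact bytes received, not a re-serialised object.
function verifyWebhookSignature(
  header: string,
  rawBody: string,
  secret: string,
  toleranceSeconds: number = 300,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  const match = /^t=(\d+),v1=([0-9a-f]{64})$/.exec(header);
  if (!match) return false;

  // Reject stale or future-dated deliveries to limit replay.
  if (Math.abs(nowSeconds - Number(match[1])) > toleranceSeconds) return false;

  // Recompute the HMAC over the same "<timestamp>.<body>" input the sender used.
  const expected = createHmac('sha256', secret).update(`${match[1]}.${rawBody}`).digest();
  const provided = Buffer.from(match[2], 'hex');

  // timingSafeEqual requires equal lengths, so check that first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```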

### 5.2 Outbound Request Restrictions

Actions that make outbound HTTP calls (webhooks, ad platform API calls)
must be restricted to prevent SSRF.

```typescript
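// security/outbound.ts
import { promises as dns } from 'dns';

const BLOCKED_HOSTS = [
  /^localhost$/i,
  /^127\.\d+\.\d+\.\d+$/,
  /^10\.\d+\.\d+\.\d+$/,
  /^172\.(1[6-9]|2\d|3[01])\.\d+\.\d+$/,
  /^192\.168\.\d+\.\d+$/,
  /^0\.0\.0\.0$/,
  /^169\.254\.\d+\.\d+$/,        // link-local, includes AWS metadata endpoint 169.254.169.254
  /^\[?::1\]?$/,                 // IPv6 loopback
  /^\[?f[cd][0-9a-f]{2}:/i,      // IPv6 unique local (fc00::/7); URL hostnames are bracketed
  /\.internal$/i,
  /\.local$/i,
];

const isBlockedHost = (host: string): boolean => BLOCKED_HOSTS.some(pattern => pattern.test(host));

// Async because of the DNS lookup; fails closed on any parse or resolution error.
async function validateOutboundUrl(url: string): Promise<boolean> {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false;
  }

  // Require HTTPS
  if (parsed.protocol !== 'https:') return false;

  // Block private/internal hostnames and IP literals
  if (isBlockedHost(parsed.hostname)) return false;

  // DNS resolution check: resolve the hostname (IPv4 only) and verify no address is private.
  // The HTTP client should connect to the address checked here to avoid DNS rebinding.
  try {
    const resolved = await dns.resolve4(parsed.hostname);
    if (resolved.some(ip => isBlockedHost(ip))) return false;
  } catch {
    return false;
  }

  return true;
}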
```

### 5.3 OAuth Security

For platform connections (Slack, Google Ads, Meta Ads):

- OAuth `state` parameter: signed JWT containing `user_id + nonce + timestamp`
- PKCE enforced where supported (Google, Meta)
- Tokens stored encrypted (see 2.2)
- Token refresh handled automatically — if refresh fails, mark connection `EXPIRED`
- Scopes: request minimum necessary (principle of least privilege)
- Redirect URIs: exact match only, no wildcards
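
For the PKCE control, the verifier and challenge can be derived with Node's `crypto` module as specified in RFC 7636 (S256 method). The function names are hypothetical; this is a sketch of the derivation only, not of the full OAuth flow.

```typescript
import { createHash, randomBytes } from 'crypto';

// RFC 7636, S256: code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))
function pkceChallenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// 32 random bytes encode to a 43-character verifier, the minimum length RFC 7636 allows.
function generateCodeVerifier(): string {
  return randomBytes(32).toString('base64url');
}
```

The verifier stays server-side with the pending connection; only the challenge is sent in the authorisation request, and the verifier is presented later at token exchange.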

### 5.4 Action Execution Sandboxing

Rule actions execute in a controlled environment:

- Timeout per action: 30 seconds (webhook), 60 seconds (AI generation)
- Memory limit per action execution
- No access to filesystem or environment variables from action context
- Template variables sanitised (no code execution in `{{ }}` expressions)
- Action execution is asynchronous — failures don't block the ingest pipeline
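
The template rule (no code execution in `{{ }}` expressions) can be sketched as a renderer that only substitutes plain dotted paths looked up in an explicit context object. `renderTemplate`, `TemplateContext`, and the choice to render unknown variables as an empty string are assumptions for illustration.

```typescript
type TemplateContext = { [key: string]: string | number | boolean | TemplateContext };

// Only `{{ dotted.path }}` placeholders are recognised; anything else stays literal text.
const PLACEHOLDER = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\s*\}\}/g;

// Walks own properties only, so inherited members such as `constructor` are unreachable.
function lookup(context: TemplateContext, path: string): string {
  let current: unknown = context;
  for (const segment of path.split('.')) {
    if (typeof current !== 'object' || current === null ||
        !Object.prototype.hasOwnProperty.call(current, segment)) {
      return ''; // unknown variable renders as an empty string
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return typeof current === 'object' ? '' : String(current);
}

function renderTemplate(template: string, context: TemplateContext): string {
  return template.replace(PLACEHOLDER, (_match: string, path: string) => lookup(context, path));
}
```

Because nothing is evaluated, an expression such as `{{ 1 + 1 }}` is left untouched, and a path like `{{ process.env.JWT_SECRET }}` resolves to nothing unless the caller deliberately put it in the context.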

---

## 6. Infrastructure Security

### 6.1 Network Architecture (AWS)

```
┌─── Internet ──────────────────────────────────────────────┐
│                                                           │
│  ┌─── WAF ──────────────────────────────────────────┐     │
│  │  IP reputation filtering                         │     │
│  │  Rate limiting rules                             │     │
│  │  SQL injection / XSS detection                   │     │
│  │  Geo-blocking (optional)                         │     │
│  │  Bot detection (AWS Bot Control)                  │     │
│  └──────────────────────┬───────────────────────────┘     │
│                         │                                 │
│  ┌─── Public Subnet ───▼────────────────────────────┐     │
│  │  ALB (TLS termination)                           │     │
│  │  - Only ports 80 (redirect) and 443 exposed      │     │
│  │  - Health check on /api/health                    │     │
│  └──────────────────────┬───────────────────────────┘     │
│                         │                                 │
│  ┌─── Private Subnet ──▼────────────────────────────┐     │
│  │  ECS Services                                    │     │
│  │  - No public IPs                                 │     │
│  │  - Security groups: only ALB → service port      │     │
│  │  - Service-to-service: only within SG            │     │
│  └──────────────────────┬───────────────────────────┘     │
│                         │                                 │
│  ┌─── Data Subnet ─────▼────────────────────────────┐     │
│  │  RDS, ElastiCache, S3 VPC Endpoint               │     │
│  │  - No public access                              │     │
│  │  - Security groups: only ECS services → DB port   │     │
│  │  - S3: VPC endpoint (no internet traversal)       │     │
│  └──────────────────────────────────────────────────┘     │
└───────────────────────────────────────────────────────────┘
```

### 6.2 Security Groups

| Security Group | Inbound | Outbound |
|---------------|---------|----------|
| `sg-alb` | 80 (redirect only) and 443 from 0.0.0.0/0 | ECS service ports |
| `sg-ecs-web` | 3000 from sg-alb | 5432 (RDS), 6379 (Redis), 443 (internet for APIs) |
| `sg-ecs-services` | 8001-8004 from sg-ecs-web, sg-ecs-worker | 5432, 6379, 443 |
| `sg-ecs-worker` | None (no inbound) | 5432, 6379, 8001-8004, 443 |
| `sg-rds` | 5432 from sg-ecs-* | None |
| `sg-redis` | 6379 from sg-ecs-* | None |

### 6.3 Secrets Management

```
┌──────────────────────────────────────────────────────────┐
│  Secrets Lifecycle                                       │
│                                                          │
│  Creation: Secrets Manager console / IaC (Terraform)     │
│  Storage:  KMS envelope encryption                       │
│  Access:   IAM policy restricts to ECS task roles only   │
│  Rotation: Automatic for RDS credentials (30-day cycle)  │
│  Audit:    CloudTrail logs every GetSecretValue call      │
│  Cache:    5-minute in-memory TTL (reduce API calls)     │
│  Dev:      .env file (never committed, in .gitignore)    │
└──────────────────────────────────────────────────────────┘
```

**Rules:**
- No secrets in Docker images (build args or layers)
- No secrets in source code (pre-commit hook scans for patterns)
- No secrets in CloudWatch logs (log sanitisation middleware)
- `.env` is in `.gitignore` and `.dockerignore`
- Secrets Manager resource policy restricts access to specific IAM roles
- Credential rotation: application handles graceful reload (5-min cache TTL)
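
The log sanitisation rule can be sketched as a recursive redactor applied to structured log payloads before they are written. The key pattern and the `[REDACTED]` marker are assumptions; a real middleware would tune the pattern to the field names the services actually log.

```typescript
// Hypothetical key pattern; extend to match the field names your services log.
const SENSITIVE_KEY = /(password|secret|token|authorization|api[_-]?key|credential)/i;
const REDACTED = '[REDACTED]';

// Returns a deep copy with sensitive values replaced; the input is not mutated.
// Assumes plain JSON-like data without circular references.
function sanitiseForLog(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(item => sanitiseForLog(item));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? REDACTED : sanitiseForLog(inner);
    }
    return out;
  }
  return value;
}
```

Redaction is keyed on field names, so it complements rather than replaces the pre-commit secret scan: a secret logged under an innocuous key or inside a free-text message would still pass through.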

### 6.4 Container Security

- Base images: `node:20-alpine` / `python:3.12-slim` (minimal attack surface)
- Non-root user in all containers (`USER predict`)
- Read-only filesystem where possible (`readonlyRootFilesystem: true` in ECS)
- No `--privileged` flag
- Resource limits set (CPU + memory) to prevent noisy neighbour / resource exhaustion
- Image scanning: ECR image scanning on push (CVE detection)
- Dependency scanning: `npm audit` / `pip-audit` in CI pipeline
- No SSH access to containers — use ECS Exec for debugging (logged via CloudTrail)

---

## 7. Audit & Compliance

### 7.1 Audit Trail

Every security-relevant action is logged to an immutable audit trail.

```sql
CREATE TABLE prediction_audit_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

  -- Who
  actor_id UUID,                         -- user ID (NULL for system actions)
  actor_type TEXT NOT NULL CHECK (actor_type IN ('user', 'service', 'system', 'api_key')),
  actor_ip INET,
  actor_user_agent TEXT,

  -- What
  action TEXT NOT NULL,                  -- 'event.create', 'rule.fire', 'feed.auth_fail', etc.
  resource_type TEXT NOT NULL,           -- 'event', 'rule', 'feed', 'campaign', 'connection', 'user'
  resource_id UUID,

  -- Details
  details JSONB,                         -- action-specific payload
  previous_state JSONB,                  -- for updates: what it was before
  new_state JSONB,                       -- for updates: what it is now

  -- Context
  request_id TEXT,                       -- X-Request-ID for tracing
  service TEXT,                          -- which service generated this

  -- Classification
  severity TEXT DEFAULT 'INFO' CHECK (severity IN ('INFO', 'WARNING', 'ALERT', 'CRITICAL')),

  created_at TIMESTAMPTZ DEFAULT now()
);

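-- Not partitioned as declared above. To partition by month, declare the table with
-- PARTITION BY RANGE (created_at) and include created_at in the primary key.
-- In production: consider TimescaleDB or separate audit database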

CREATE INDEX idx_audit_actor ON prediction_audit_log(actor_id);
CREATE INDEX idx_audit_action ON prediction_audit_log(action);
CREATE INDEX idx_audit_resource ON prediction_audit_log(resource_type, resource_id);
CREATE INDEX idx_audit_severity ON prediction_audit_log(severity) WHERE severity IN ('ALERT', 'CRITICAL');
CREATE INDEX idx_audit_created ON prediction_audit_log(created_at DESC);

-- Immutability: the application role cannot UPDATE, DELETE or TRUNCATE audit rows
REVOKE UPDATE, DELETE, TRUNCATE ON prediction_audit_log FROM predict_app;
```

### 7.2 Audited Actions

| Action | Severity | Details Logged |
|--------|----------|----------------|
| `auth.login` | INFO | provider, IP, user_agent |
| `auth.login_failed` | WARNING | provider, IP, reason |
| `auth.token_refresh` | INFO | provider |
| `api_key.create` | INFO | key_prefix, scopes |
| `api_key.revoke` | WARNING | key_prefix, reason |
| `event.create` | INFO | title, category |
| `event.resolve` | INFO | outcome, resolver |
| `event.resolve.dispute` | ALERT | disputer, reason |
| `rule.create` | INFO | trigger_type, action_types |
| `rule.fire` | INFO | event_id, trigger_snapshot |
| `rule.fire.failed` | WARNING | error, action_type |
| `connection.create` | INFO | platform, scopes |
| `connection.revoke` | WARNING | platform, reason |
| `connection.auth_fail` | WARNING | platform, error |
| `feed.create` | INFO | source, poll_interval |
| `feed.auth_fail` | WARNING | source, error |
| `feed.auto_disabled` | ALERT | source, consecutive_failures |
| `trust.flag_raised` | ALERT | flag_type, severity, event |
| `trust.flag_critical` | CRITICAL | flag_type, event, evidence |
| `admin.user_role_change` | ALERT | target_user, old_role, new_role |
| `admin.system_rule_change` | WARNING | rule_id, change_type |
| `export.data_export` | INFO | scope, format, row_count |
| `billing.tier_change` | INFO | old_tier, new_tier |
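
One way to keep writers consistent with the table above is a single lookup that maps each action to its severity. The sketch below is an assumption about how that could be wired, not the existing audit writer; `ACTION_SEVERITY` and `severityFor` are hypothetical names, and the `WARNING` fallback for unlisted actions (such as `security.ip_blocked` in 10.3) is a design choice made here for illustration.

```typescript
type Severity = 'INFO' | 'WARNING' | 'ALERT' | 'CRITICAL';

// Mirrors the "Audited Actions" table row for row.
const ACTION_SEVERITY = new Map<string, Severity>([
  ['auth.login', 'INFO'],
  ['auth.login_failed', 'WARNING'],
  ['auth.token_refresh', 'INFO'],
  ['api_key.create', 'INFO'],
  ['api_key.revoke', 'WARNING'],
  ['event.create', 'INFO'],
  ['event.resolve', 'INFO'],
  ['event.resolve.dispute', 'ALERT'],
  ['rule.create', 'INFO'],
  ['rule.fire', 'INFO'],
  ['rule.fire.failed', 'WARNING'],
  ['connection.create', 'INFO'],
  ['connection.revoke', 'WARNING'],
  ['connection.auth_fail', 'WARNING'],
  ['feed.create', 'INFO'],
  ['feed.auth_fail', 'WARNING'],
  ['feed.auto_disabled', 'ALERT'],
  ['trust.flag_raised', 'ALERT'],
  ['trust.flag_critical', 'CRITICAL'],
  ['admin.user_role_change', 'ALERT'],
  ['admin.system_rule_change', 'WARNING'],
  ['export.data_export', 'INFO'],
  ['billing.tier_change', 'INFO'],
]);

// Assumption: actions missing from the table default to WARNING so they surface in review.
function severityFor(action: string): Severity {
  return ACTION_SEVERITY.get(action) ?? 'WARNING';
}
```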

### 7.3 Log Sanitisation

Before writing to CloudWatch / audit log, all payloads are sanitised.

```typescript
// logging/sanitize.ts
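// Entries are compared against the lower-cased key, so every entry must be lower-case.
// ('apiKey' in mixed case would never match; 'apikey' covers apiKey, APIKey, etc.)
const SENSITIVE_KEYS = [
  'password', 'secret', 'token', 'api_key', 'apikey', 'access_token',
  'refresh_token', 'authorization', 'credentials', 'key_hash',
  'credit_card', 'ssn', 'credential',
];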

function sanitizeForLogging(obj: any, depth = 0): any {
  if (depth > 10) return '[TRUNCATED]';
  if (typeof obj !== 'object' || obj === null) return obj;

  const sanitized: any = Array.isArray(obj) ? [] : {};
  for (const [key, value] of Object.entries(obj)) {
    if (SENSITIVE_KEYS.some(sk => key.toLowerCase().includes(sk))) {
      sanitized[key] = '[REDACTED]';
    } else if (typeof value === 'object') {
      sanitized[key] = sanitizeForLogging(value, depth + 1);
    } else {
      sanitized[key] = value;
    }
  }
  return sanitized;
}
```

---

## 8. Subscription & Billing Security

### 8.1 Billing Tables

```sql
CREATE TABLE prediction_subscriptions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES auth.users(id),
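  -- prediction_orgs (9.1) references this table back via subscription_id,
  -- so one of the two foreign keys has to be added with ALTER TABLE after both tables exist.
  org_id UUID REFERENCES prediction_orgs(id),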

  -- Plan
  tier TEXT NOT NULL DEFAULT 'free'
    CHECK (tier IN ('free', 'pro', 'team', 'enterprise', 'api_developer', 'api_startup', 'api_growth', 'api_enterprise')),

  -- Stripe integration
  stripe_customer_id TEXT UNIQUE,
  stripe_subscription_id TEXT UNIQUE,

  -- Status
  status TEXT NOT NULL DEFAULT 'active'
    CHECK (status IN ('active', 'past_due', 'cancelled', 'trialing', 'paused')),

  -- Limits
  max_events INTEGER,             -- NULL = unlimited
  max_rules INTEGER,
  max_campaigns_per_month INTEGER,
  max_feeds INTEGER,
  max_team_members INTEGER,
  api_rate_limit_rpm INTEGER,
  api_rate_limit_rpd INTEGER,

  -- Period
  current_period_start TIMESTAMPTZ,
  current_period_end TIMESTAMPTZ,
  trial_end TIMESTAMPTZ,
  cancelled_at TIMESTAMPTZ,

  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE UNIQUE INDEX idx_subscriptions_user ON prediction_subscriptions(user_id);
-- stripe_customer_id is already indexed by its UNIQUE constraint; no separate index is needed
```

### 8.2 Usage Metering

```sql
CREATE TABLE prediction_usage_metrics (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES auth.users(id),

  -- Period
  period_start TIMESTAMPTZ NOT NULL,    -- start of billing period
  period_end TIMESTAMPTZ NOT NULL,

  -- Counts
  api_calls INTEGER DEFAULT 0,
  events_tracked INTEGER DEFAULT 0,
  rules_active INTEGER DEFAULT 0,
  campaigns_created INTEGER DEFAULT 0,
  explanations_generated INTEGER DEFAULT 0,
  feeds_active INTEGER DEFAULT 0,
  signals_ingested BIGINT DEFAULT 0,
  storage_bytes_used BIGINT DEFAULT 0,

  -- Cost tracking
  llm_tokens_used BIGINT DEFAULT 0,     -- Claude API token count
  llm_cost_cents INTEGER DEFAULT 0,     -- estimated Claude API cost

  updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE UNIQUE INDEX idx_usage_user_period ON prediction_usage_metrics(user_id, period_start);
```
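
The unique index on `(user_id, period_start)` makes it possible to increment a counter with a single upsert. The following is a minimal sketch of building such a statement; `buildUsageIncrement`, `USAGE_COUNTERS` and `ParameterisedQuery` are hypothetical names, and the result is assumed to be passed to a Postgres client that accepts `$n` placeholders.

```typescript
// Counter columns that may be incremented. The column name is interpolated into SQL,
// so it is restricted to this fixed allow-list.
const USAGE_COUNTERS = [
  'api_calls', 'events_tracked', 'campaigns_created',
  'explanations_generated', 'signals_ingested', 'llm_tokens_used', 'llm_cost_cents',
] as const;
type UsageCounter = (typeof USAGE_COUNTERS)[number];

interface ParameterisedQuery {
  text: string;
  values: unknown[];
}

function buildUsageIncrement(
  userId: string,
  periodStart: Date,
  periodEnd: Date,
  counter: UsageCounter,
  amount: number,
): ParameterisedQuery {
  // Runtime check as well as the compile-time type, in case the value came from untyped input.
  if (!(USAGE_COUNTERS as readonly string[]).includes(counter)) {
    throw new Error(`Unknown usage counter: ${counter}`);
  }
  if (!Number.isInteger(amount) || amount <= 0) {
    throw new RangeError('amount must be a positive integer');
  }
  const text = [
    `INSERT INTO prediction_usage_metrics (user_id, period_start, period_end, ${counter})`,
    'VALUES ($1, $2, $3, $4)',
    'ON CONFLICT (user_id, period_start) DO UPDATE',
    `SET ${counter} = prediction_usage_metrics.${counter} + EXCLUDED.${counter}, updated_at = now()`,
  ].join('\n');
  return { text, values: [userId, periodStart, periodEnd, amount] };
}
```

Doing the addition inside the database avoids a read-modify-write race when several services report usage for the same user at once.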

### 8.3 Feature Gating

```typescript
// middleware/feature-gate.ts
async function enforceFeatureGate(userId: string, feature: string): Promise<boolean> {
  const subscription = await getSubscription(userId);

  switch (feature) {
    case 'realtime_data':
      return subscription.tier !== 'free';
    case 'explanations':
      return subscription.tier !== 'free';
    case 'campaign_prediction':
      if (subscription.tier === 'free') return false;
      if (subscription.max_campaigns_per_month != null) {  // NULL = unlimited; a limit of 0 must still be enforced
        const usage = await getCurrentUsage(userId);
        return usage.campaigns_created < subscription.max_campaigns_per_month;
      }
      return true;
    case 'trust_details':
      return ['pro', 'team', 'enterprise'].includes(subscription.tier);
    case 'data_export':
      return subscription.tier !== 'free';
    case 'custom_feeds':
      return ['team', 'enterprise'].includes(subscription.tier);
    default:
      // Fail closed: an unrecognised feature name is denied rather than silently allowed
      return false;
  }
}
```

---

## 9. Multi-Tenancy & Data Isolation

### 9.1 Organisation Model

```sql
CREATE TABLE prediction_orgs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  slug TEXT NOT NULL UNIQUE,
  owner_id UUID NOT NULL REFERENCES auth.users(id),

  -- Settings
  settings JSONB DEFAULT '{}',

  -- Subscription link
  subscription_id UUID REFERENCES prediction_subscriptions(id),

  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE prediction_org_members (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  org_id UUID NOT NULL REFERENCES prediction_orgs(id) ON DELETE CASCADE,
  user_id UUID NOT NULL REFERENCES auth.users(id),
  role TEXT NOT NULL DEFAULT 'member'
    CHECK (role IN ('owner', 'admin', 'member', 'viewer')),

  invited_by UUID REFERENCES auth.users(id),
  invited_at TIMESTAMPTZ DEFAULT now(),
  accepted_at TIMESTAMPTZ,

  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE UNIQUE INDEX idx_org_members_unique ON prediction_org_members(org_id, user_id);
CREATE INDEX idx_org_members_user ON prediction_org_members(user_id);
```

### 9.2 Tenant-Aware RLS

```sql
-- Rules: accessible by owner OR same org members
ALTER TABLE prediction_rules ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users see own rules or team rules"
  ON prediction_rules FOR SELECT
  USING (
    user_id = auth.uid()
    OR EXISTS (
      SELECT 1 FROM prediction_org_members om1
      JOIN prediction_org_members om2 ON om1.org_id = om2.org_id
      WHERE om1.user_id = auth.uid()
        AND om2.user_id = prediction_rules.user_id
        AND prediction_rules.scope_type != 'all' -- 'all' scope requires admin
    )
  );

-- Campaigns: visible to the owner and to members of the same org
ALTER TABLE prediction_campaigns ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users see own campaigns or org campaigns"
  ON prediction_campaigns FOR SELECT
  USING (
    user_id = auth.uid()
    OR EXISTS (
      SELECT 1 FROM prediction_org_members om1
      JOIN prediction_org_members om2 ON om1.org_id = om2.org_id
      WHERE om1.user_id = auth.uid()
        AND om2.user_id = prediction_campaigns.user_id
    )
  );

-- Action connections: ALWAYS private to user (never shared)
ALTER TABLE prediction_action_connections ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Connections are private to user"
  ON prediction_action_connections FOR ALL
  USING (user_id = auth.uid());

-- API keys: ALWAYS private to user
ALTER TABLE prediction_api_keys ENABLE ROW LEVEL SECURITY;

CREATE POLICY "API keys are private to user"
  ON prediction_api_keys FOR ALL
  USING (user_id = auth.uid());
```
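
For unit tests it can help to mirror the rules policy as a pure function and check the edge cases without a database. The sketch below is an illustrative model only: `OrgMembership`, `RuleRow`, `sharesOrg` and `canSelectRule` are hypothetical names, and it assumes `scope_type` may be NULL.

```typescript
interface OrgMembership {
  orgId: string;
  userId: string;
}

interface RuleRow {
  userId: string;
  scopeType: string | null;
}

// True when both users belong to at least one common org (the om1/om2 self-join in the policy).
function sharesOrg(memberships: OrgMembership[], a: string, b: string): boolean {
  const orgsOfA = new Set(memberships.filter((m) => m.userId === a).map((m) => m.orgId));
  return memberships.some((m) => m.userId === b && orgsOfA.has(m.orgId));
}

function canSelectRule(viewerId: string, rule: RuleRow, memberships: OrgMembership[]): boolean {
  // Owners always see their own rules, whatever the scope.
  if (rule.userId === viewerId) return true;
  // In SQL, `scope_type != 'all'` evaluates to NULL (not true) for a NULL scope_type,
  // so such rows are hidden from teammates as well.
  if (rule.scopeType === null || rule.scopeType === 'all') return false;
  return sharesOrg(memberships, viewerId, rule.userId);
}
```

Note that the policy as written has no branch that grants admins access to `'all'`-scoped rules, so in this model only the owner can read them.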

---

## 10. Incident Response

### 10.1 Security Monitoring

| Signal | Detection | Response |
|--------|-----------|----------|
| Multiple failed logins | >5 failures in 10 min per IP | Block IP for 30 min, notify user |
| API key abuse | Rate limit exceeded 3x in 1 hour | Temporarily suspend key, email owner |
| Trust flag CRITICAL | Automated detection | Notify admin, log to audit, in-app alert |
| Feed auth failures | 3+ consecutive auth failures | Auto-disable feed, email user |
| Unusual data export | >10K rows exported in 1 hour | Alert admin, throttle exports |
| OAuth token theft attempt | Token used from new IP + user-agent | Invalidate token, notify user |
| SQL injection attempt | WAF rule match | Block request, log payload, alert security |
| Service key leak | Key appears in logs or public repo | Rotate immediately (detection automated via GitHub secret scanning) |

### 10.2 Incident Severity Levels

| Level | Definition | Response Time | Example |
|-------|-----------|---------------|---------|
| P1 | Data breach, credential compromise | 15 min | Secrets leaked, DB exposed |
| P2 | Service-level security failure | 1 hour | Auth bypass, privilege escalation |
| P3 | Detected attack (blocked) | 4 hours | WAF blocked injection, rate limit abuse |
| P4 | Security improvement needed | Next sprint | Missing CSP header, dependency CVE |

### 10.3 Automated Responses

```typescript
// security/auto-response.ts
const autoResponders: SecurityResponder[] = [
  {
    signal: 'auth.login_failed',
    threshold: 5,
    window: '10m',
    action: async (events) => {
      await blockIP(events[0].ip, '30m');
      await notifyUser(events[0].userId, 'Multiple failed login attempts detected');
      await auditLog('security.ip_blocked', { ip: events[0].ip, reason: 'brute_force' });
    },
  },
  {
    signal: 'api_key.rate_exceeded',
    threshold: 3,
    window: '1h',
    action: async (events) => {
      await suspendApiKey(events[0].keyId, '1h');
      await emailKeyOwner(events[0].keyId, 'API key temporarily suspended due to rate limit abuse');
      await auditLog('security.key_suspended', { keyId: events[0].keyId });
    },
  },
];
```
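
The `threshold` and `window` fields above imply some form of sliding-window counting per key (IP address or API key). A minimal in-memory sketch is shown below; `parseWindow` and `SlidingWindowCounter` are hypothetical names, the responder is assumed to fire once the count reaches the threshold, and a multi-instance deployment would more likely keep these counters in Redis.

```typescript
// Convert windows written like '10m' or '1h' into milliseconds.
function parseWindow(window: string): number {
  const match = /^(\d+)([smhd])$/.exec(window);
  if (match === null) throw new Error(`Unsupported window: ${window}`);
  const unitMs: Record<string, number> = {
    s: 1000,
    m: 60 * 1000,
    h: 60 * 60 * 1000,
    d: 24 * 60 * 60 * 1000,
  };
  return Number(match[1]) * unitMs[match[2]];
}

class SlidingWindowCounter {
  private readonly hits = new Map<string, number[]>();
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold: number, window: string) {
    this.threshold = threshold;
    this.windowMs = parseWindow(window);
  }

  // Record one event for `key` at time `atMs` (epoch ms).
  // Returns true once `threshold` events fall inside the trailing window.
  record(key: string, atMs: number): boolean {
    const cutoff = atMs - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(atMs);
    this.hits.set(key, recent);
    return recent.length >= this.threshold;
  }
}
```

This sketch never evicts idle keys, so a long-running process would also need a periodic sweep or a bounded store.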

---

## 11. Compliance Considerations

### 11.1 Regulatory Landscape

| Regulation | Relevance | Approach / Status |
|-----------|-----------|--------|
| GDPR | User PII (EU users) | Data minimisation, right to deletion, consent |
| CCPA | User PII (California users) | Similar to GDPR — opt-out, deletion |
| SOC 2 Type II | Enterprise customers will require | Target for Year 2 |
| Gambling regulations | Prediction markets are adjacent | Position as analytics, NOT betting |
| Financial data regulations | Market data redistribution | License data sources properly |
| CFTC/SEC | If any financial products created | Legal review needed Phase 6 |

### 11.2 Data Retention

```sql
-- Retention policy (enforced by scheduled job)

-- Feed runs: 90 days
DELETE FROM prediction_feed_runs WHERE created_at < now() - interval '90 days';

-- Audit log: 2 years (compliance requirement)
-- Move to cold storage (S3) after 90 days, delete after 2 years

-- Snapshots: full resolution for 90 days, then progressively downsampled
-- After 90 days: keep 1 snapshot per hour (delete intermediate)
-- After 1 year: keep 1 snapshot per day

-- Rule executions: 90 days
DELETE FROM prediction_rule_executions WHERE started_at < now() - interval '90 days';

-- Unmatched signals: 30 days
DELETE FROM prediction_unmatched_signals
  WHERE status IN ('DISCARDED', 'MATCHED') AND created_at < now() - interval '30 days';

-- User data: retained until account deletion requested
-- On deletion: anonymise audit logs, delete personal data, retain aggregated stats
```
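
The snapshot downsampling rule is the only retention step above without a one-line `DELETE`, so a sketch of the selection logic may help. It assumes snapshots are identified by epoch-millisecond timestamps, treats one year as 365 days, keeps the earliest snapshot in each bucket, and uses UTC hour and day boundaries; `selectSnapshotsToKeep` is a hypothetical name.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Returns the timestamps to retain, in ascending order; everything else can be deleted.
function selectSnapshotsToKeep(timestampsMs: number[], nowMs: number): number[] {
  const seenBuckets = new Set<string>();
  const keep: number[] = [];

  for (const ts of [...timestampsMs].sort((a, b) => a - b)) {
    const ageMs = nowMs - ts;

    // Up to 90 days old: keep at full resolution.
    if (ageMs <= 90 * DAY_MS) {
      keep.push(ts);
      continue;
    }

    // 90 days to 1 year: one per hour. Older than 1 year: one per day.
    const bucket =
      ageMs <= 365 * DAY_MS
        ? `h:${Math.floor(ts / HOUR_MS)}`
        : `d:${Math.floor(ts / DAY_MS)}`;

    if (!seenBuckets.has(bucket)) {
      seenBuckets.add(bucket);
      keep.push(ts);
    }
  }
  return keep;
}
```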

### 11.3 GDPR Right to Deletion

```typescript
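async function handleDeletionRequest(userId: string): Promise<void> {
  // Steps 1-8 should run in a single database transaction so that a failure
  // part-way through does not leave the account half-deleted.

  // 1. Delete personal campaigns and variants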
  await db.query('DELETE FROM prediction_campaign_variants WHERE campaign_id IN (SELECT id FROM prediction_campaigns WHERE user_id = $1)', [userId]);
  await db.query('DELETE FROM prediction_campaigns WHERE user_id = $1', [userId]);

  // 2. Delete rules and executions
  await db.query('DELETE FROM prediction_rule_executions WHERE rule_id IN (SELECT id FROM prediction_rules WHERE user_id = $1)', [userId]);
  await db.query('DELETE FROM prediction_rules WHERE user_id = $1', [userId]);

  // 3. Delete connections
  await db.query('DELETE FROM prediction_action_connections WHERE user_id = $1', [userId]);

  // 4. Delete API keys
  await db.query('DELETE FROM prediction_api_keys WHERE user_id = $1', [userId]);

  // 5. Delete feeds
  await db.query('DELETE FROM prediction_data_feeds WHERE user_id = $1', [userId]);

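  // 6. Anonymise audit logs (retain for compliance, remove PII)
  // predict_app has UPDATE revoked on this table (7.1), so this statement must run under a separate privileged role.
  await db.query("UPDATE prediction_audit_log SET actor_id = NULL, actor_ip = NULL, actor_user_agent = NULL WHERE actor_id = $1", [userId]);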

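  // 7. Delete subscription + usage
  // prediction_orgs.subscription_id and owner_id (9.1) may still reference these rows;
  // transfer or delete any org the user owns first, or the deletes can fail on the foreign keys.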
  await db.query('DELETE FROM prediction_usage_metrics WHERE user_id = $1', [userId]);
  await db.query('DELETE FROM prediction_subscriptions WHERE user_id = $1', [userId]);

  // 8. Delete org memberships
  await db.query('DELETE FROM prediction_org_members WHERE user_id = $1', [userId]);

  // 9. Remove from auth provider
  await authProvider.deleteUser(userId);

  // 10. Log the deletion (pseudonymised: an unsalted hash can be recomputed by anyone who knows the user ID)
  await auditLog('user.deleted', { anonymised_id: sha256(userId) });
}
```

---

## 12. Pre-Commit & CI Security Checks

### 12.1 Pre-Commit Hooks

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    # rev: pin to a released tag here; pre-commit requires `rev` for remote repos
    hooks:
      - id: gitleaks  # Detect secrets in commits

  - repo: local
    hooks:
      - id: no-env-files
        name: Prevent .env file commits
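        # Match .env and .env.* at any depth, but not names that merely contain ".env"
        # (allow-list .env.example separately if one is committed)
        entry: bash -c 'git diff --cached --name-only | grep -qE "(^|/)\.env(\..+)?$" && exit 1 || exit 0'
        language: system
        pass_filenames: false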
```

### 12.2 CI Pipeline Security

```yaml
# In CI pipeline:
steps:
  - name: Dependency audit
    run: npm audit --audit-level=high && pip-audit

  - name: Secret scanning
    run: gitleaks detect --source .

  - name: Container scanning
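    # Without --exit-code, trivy reports findings but still exits 0
    run: trivy image --exit-code 1 --severity HIGH,CRITICAL predict-web:${{ github.sha }}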

  - name: SAST (static analysis)
    # --error makes semgrep exit non-zero when there are findings
    run: semgrep --config=auto --error .

  - name: License compliance
    run: license-checker --production --failOn "GPL-3.0"
```

---

## Security Checklist by Build Phase

| Phase | Security Tasks |
|-------|---------------|
| 0 | Auth middleware (Cognito + Agencio), RBAC skeleton, input validation (Zod), `.gitignore` for secrets, pre-commit hooks |
| 1 | Feed credential encryption, rate limiting middleware, service-to-service auth key, outbound URL validation |
| 2 | Log sanitisation, CSP headers, explanation output sanitisation (DOMPurify) |
| 3 | Audit log table + writer, trust flag security alerts, flag review workflow |
| 3.5 | Webhook signing (outbound), OAuth flow security (PKCE + state), action sandboxing, SSRF prevention |
| 4 | Campaign data RLS, persona integration auth |
| 5 | Resolution dispute auth, calibration data export controls |
| 6 | WAF rules, anti-scraping, response fingerprinting |
| 7 | API key system, subscription enforcement, usage metering, Stripe webhook verification, billing audit trail, GDPR deletion flow |