# Integrations & Data Flow Architecture

This document maps the platform's external integrations, traces how their data moves through each layer, and shows which features consume it.

## Executive Summary

Agencio Predict integrates with **65+ external services** across 16 categories to provide:

- Real-time and historical market data
- Prediction market signals
- Social sentiment analysis
- Economic indicators
- LLM-powered analysis
- Trading execution

## Data Flow Overview

```
┌─────────────────────────────────────────────────────────────────────────────────┐
│                              EXTERNAL DATA SOURCES                               │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│  │ MARKET DATA │  │ PREDICTION  │  │   SOCIAL    │  │    MACRO    │           │
│  │             │  │   MARKETS   │  │  SENTIMENT  │  │  ECONOMIC   │           │
│  │ Yahoo       │  │ Polymarket  │  │ Reddit      │  │ FRED        │           │
│  │ Finnhub     │  │ Kalshi      │  │ Twitter/X   │  │ World Bank  │           │
│  │ CoinGecko   │  │ Metaculus   │  │ Telegram    │  │ FXStreet    │           │
│  │ Polygon.io  │  │ PredictIt   │  │ Discord     │  │ SEC EDGAR   │           │
│  │ Binance     │  └─────────────┘  │ Bluesky     │  └─────────────┘           │
│  │ Alpaca      │                   │ Truth Social│                             │
│  │ Frankfurter │  ┌─────────────┐  └─────────────┘  ┌─────────────┐           │
│  └─────────────┘  │ DERIVATIVES │                   │    NEWS     │           │
│                   │             │  ┌─────────────┐  │             │           │
│  ┌─────────────┐  │ Yahoo VIX   │  │     LLM     │  │ NewsAPI     │           │
│  │   BROKERS   │  │ Binance     │  │  PROVIDERS  │  │ Finnhub     │           │
│  │             │  │ Deribit     │  │             │  │ GDELT       │           │
│  │ Alpaca      │  │ Coinglass   │  │ Claude      │  │ Guardian    │           │
│  │ IBKR        │  │ Alt.me      │  │ LLMRouter   │  │ NYT         │           │
│  │ Binance     │  └─────────────┘  │ Voyage      │  │ Google News │           │
│  │ Pepperstone │                   │ OpenAI      │  └─────────────┘           │
│  │ Schwab      │                   └─────────────┘                             │
│  └─────────────┘                                                               │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           AGENCIO PREDICT PLATFORM                              │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  ┌─────────────────────────────────────────────────────────────────────────┐   │
│  │                         INGESTION LAYER                                  │   │
│  │                                                                         │   │
│  │  Scheduler Jobs (15min default)                                         │   │
│  │  ├─ price-collector        → market_price_snapshots                     │   │
│  │  ├─ news-ingestion         → news_archive                               │   │
│  │  ├─ social-follows         → sentiment_hourly                           │   │
│  │  ├─ prediction-markets     → market_events                              │   │
│  │  ├─ derivatives-sync       → derivatives_data                           │   │
│  │  ├─ economic-calendar      → economic_events                            │   │
│  │  ├─ stock-hunter-discovery → stock_hunter_recommendations               │   │
│  │  ├─ sync-13f-incremental   → institutional_holdings                     │   │
│  │  └─ sync-activist-filings  → activist_positions                         │   │
│  │                                                                         │   │
│  └─────────────────────────────────────────────────────────────────────────┘   │
│                                        │                                        │
│                                        ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────────┐   │
│  │                         PROCESSING LAYER                                │   │
│  │                                                                         │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │   │
│  │  │ Divergence   │  │ Tick         │  │ Pattern      │                  │   │
│  │  │ Engine       │  │ Classifier   │  │ Detector     │                  │   │
│  │  │              │  │              │  │              │                  │   │
│  │  │ Human vs Bot │  │ Whale/Bot    │  │ FVG, Candle  │                  │   │
│  │  │ Detection    │  │ Activity     │  │ DTW Match    │                  │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘                  │   │
│  │                                                                         │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │   │
│  │  │ Black Swan   │  │ Sentiment    │  │ Manipulation │                  │   │
│  │  │ Detector     │  │ Analyzer     │  │ Detector     │                  │   │
│  │  │              │  │              │  │              │                  │   │
│  │  │ Crisis Early │  │ LLM Scored   │  │ Spoof/Wash   │                  │   │
│  │  │ Warning      │  │ Sentiment    │  │ Detection    │                  │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘                  │   │
│  │                                                                         │   │
│  └─────────────────────────────────────────────────────────────────────────┘   │
│                                        │                                        │
│                                        ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────────┐   │
│  │                         INTELLIGENCE LAYER                              │   │
│  │                                                                         │   │
│  │  ┌──────────────────────────────────────────────────────────────────┐  │   │
│  │  │                    ALL-SEEING EYE                                │  │   │
│  │  │                                                                  │  │   │
│  │  │  Aggregates ALL signals into unified probability layer:         │  │   │
│  │  │  - Prediction market consensus                                  │  │   │
│  │  │  - Price momentum/technical signals                             │  │   │
│  │  │  - Sentiment divergence                                         │  │   │
│  │  │  - Institutional flows (13F)                                    │  │   │
│  │  │  - Activist positions (13D/13G)                                 │  │   │
│  │  │  - Economic surprise index                                      │  │   │
│  │  │  - Crisis resilience scores                                     │  │   │
│  │  │  - PINN ML predictions                                          │  │   │
│  │  └──────────────────────────────────────────────────────────────────┘  │   │
│  │                                                                         │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │   │
│  │  │ AI Stock     │  │ AI Fund      │  │ AI Algorithm │                  │   │
│  │  │ Hunter       │  │ Manager      │  │ Builder      │                  │   │
│  │  │              │  │              │  │              │                  │   │
│  │  │ Claude       │  │ Multi-strat  │  │ NL → DSL     │                  │   │
│  │  │ Analysis     │  │ Orchestrator │  │ LLM Jury     │                  │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘                  │   │
│  │                                                                         │   │
│  └─────────────────────────────────────────────────────────────────────────┘   │
│                                        │                                        │
│                                        ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────────┐   │
│  │                         EXECUTION LAYER                                 │   │
│  │                                                                         │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │   │
│  │  │ Paper        │  │ Live         │  │ Broker       │                  │   │
│  │  │ Executor     │  │ Executor     │  │ Adapters     │                  │   │
│  │  │              │  │              │  │              │                  │   │
│  │  │ Simulated    │  │ Real Orders  │  │ Alpaca       │                  │   │
│  │  │ Trading      │  │ 4-Gate Check │  │ IBKR, Binance│                  │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘                  │   │
│  │                                                                         │   │
│  └─────────────────────────────────────────────────────────────────────────┘   │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
```

---

## Integration Categories

### 1. Market Price Data

Primary sources for real-time and historical price data.

| Provider | Asset Classes | Key Features | Rate Limits | Auth |
|----------|--------------|--------------|-------------|------|
| **Yahoo Finance** | Stocks, ETFs, Crypto, Forex, Indices | OHLCV, company profiles | IP-based | None |
| **Finnhub** | Stocks, Forex | Real-time quotes, fundamentals | 60/min free | API Key |
| **CoinGecko** | Crypto | 8000+ coins, market cap, trending | 50/min free | Optional |
| **Polygon.io** | Stocks, Options, Forex, Crypto | Trades, quotes, aggregates | Tier-based | API Key + BYOK |
| **Binance** | Crypto | aggTrades, OHLCV, futures | 1200/min | None (public) |
| **Alpaca** | Stocks, Crypto | Real-time, historical | Account-based | OAuth |
| **Frankfurter** | Forex | ECB reference rates | Unlimited | None |

**Data Flow:**
```
Provider → price-service.ts → market_price_history / market_price_snapshots
                           → getHistoricalPrices() / getLivePrice()
                           → DSL evaluator / UI components
```

**Key File:** `packages/be/src/trading/data/price-service.ts`

---

### 2. Prediction Markets

Aggregated forecasting signals from crowd wisdom.

| Provider | Markets | Coverage | Status |
|----------|---------|----------|--------|
| **Polymarket** | 200+ | Crypto, politics, economy | Live |
| **Kalshi** | 100+ | US elections, economic events | Live |
| **Metaculus** | 1000+ | Long-term forecasts, science | Live |
| **PredictIt** | 50+ | US politics | Blocked (Cloudflare) |

**Data Flow:**
```
Providers → fetchAllPredictionMarkets() → market_events
                                       → All-Seeing Eye aggregation
                                       → Prediction market DSL primitives
```

**Key File:** `packages/be/src/all-seeing-eye/aggregation/prediction-markets.ts`
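
The fan-out in the flow above can be sketched as follows. This is a minimal illustration, not the actual `fetchAllPredictionMarkets()` implementation: the `MarketEvent` shape, the `ProviderFetcher` type, and the function name `fetchAllSketch` are assumptions. It uses `Promise.allSettled` so that one unreachable provider (for example a blocked PredictIt) does not fail the whole batch.

```typescript
// Hypothetical normalized shape; the real market_events schema may differ.
interface MarketEvent {
  provider: string;
  question: string;
  probability: number; // 0..1
}

type ProviderFetcher = () => Promise<MarketEvent[]>;

// Query every provider in parallel; keep what succeeds, report what fails.
async function fetchAllSketch(
  fetchers: Record<string, ProviderFetcher>,
): Promise<{ events: MarketEvent[]; failed: string[] }> {
  const names = Object.keys(fetchers);
  const results = await Promise.allSettled(names.map((n) => fetchers[n]()));
  const events: MarketEvent[] = [];
  const failed: string[] = [];
  results.forEach((r, i) => {
    if (r.status === "fulfilled") events.push(...r.value);
    else failed.push(names[i]);
  });
  return { events, failed };
}
```

The `failed` list gives the caller something to log or surface in a health check while still writing the successful events.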

---

### 3. Derivatives & Volatility

Market risk indicators and derivatives data.

| Provider | Data | Use Case |
|----------|------|----------|
| **Yahoo VIX** | S&P 500 implied volatility | Crisis detection |
| **Binance Futures** | Funding rates, open interest | Crypto leverage |
| **Deribit** | BTC/ETH DVOL | Crypto vol surface |
| **Coinglass** | Liquidations | Leverage unwinding |
| **Alternative.me** | Fear & Greed Index | Sentiment extremes |

**Data Flow:**
```
Providers → derivatives.ts → derivatives_data
                          → DSL: vix(), funding_rate(), fear_greed()
                          → Black swan detector inputs
```

**Key File:** `packages/be/src/integrations/derivatives.ts`

---

### 4. Macro Economic

Government and institutional economic data.

| Provider | Data Series | Update Frequency |
|----------|-------------|------------------|
| **FRED** | Treasury yields, CPI, SOFR, employment | Daily/Monthly |
| **World Bank** | International GDP, CPI | Quarterly |
| **FXStreet** | Economic calendar | Real-time |
| **SEC EDGAR** | 13F, 13D/13G filings | As filed |

**Data Flow:**
```
FRED → overlays/data-fetcher.ts → overlay series (188+ time series)
                               → Yield curve spread calculations
                               → Treasury signal DSL primitives

SEC → institutional/ + activist/ → Holdings/positions tables
                                → DSL: institutional_ownership_pct(), has_activist_position()
```

**Key Files:**
- `packages/be/src/overlays/data-fetcher.ts`
- `packages/be/src/institutional/`
- `packages/be/src/activist/`
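
As an illustration of the yield curve spread step, here is a minimal sketch that subtracts a short-maturity series from a long-maturity one on matching dates. The `Observation` shape and the function name are assumptions for this example; the internals of `data-fetcher.ts` are not shown in this document.

```typescript
// One observation in an overlay series; shape is assumed for illustration.
interface Observation {
  date: string; // ISO date, e.g. "2024-01-02"
  value: number; // yield in percent
}

// Spread = long yield minus short yield, computed only on dates present in both series.
function yieldSpread(longEnd: Observation[], shortEnd: Observation[]): Observation[] {
  const shortByDate = new Map<string, number>();
  for (const o of shortEnd) shortByDate.set(o.date, o.value);

  const out: Observation[] = [];
  for (const o of longEnd) {
    const s = shortByDate.get(o.date);
    if (s !== undefined) out.push({ date: o.date, value: o.value - s });
  }
  return out;
}
```

With a 10-year series as `longEnd` and a 2-year series as `shortEnd`, a negative value indicates an inverted curve.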

---

### 5. News & Events

Real-time news for sentiment and event detection.

| Provider | Type | Coverage |
|----------|------|----------|
| **NewsAPI** | Headlines | 80+ sources |
| **Finnhub News** | Market news | Financial focused |
| **GDELT** | Global events | Entity extraction |
| **Guardian/NYT** | Quality journalism | Deep analysis |
| **Google News RSS** | Aggregated | Broad coverage |

**Data Flow:**
```
Providers → news-collector.ts → news_archive
                             → RAG corpus for LLM context
                             → Event detection for triggers
```

**Key File:** `packages/be/src/scheduler/news-collector.ts`

---

### 6. Social Sentiment

Multi-platform social signal aggregation.

| Platform | Access | Data | Rate Limits |
|----------|--------|------|-------------|
| **Reddit** | Public API | Posts, comments, upvotes | 100/min |
| **Twitter/X** | Bearer token | Tweets, engagement | Set by paid tier ($200/mo) |
| **Telegram** | User bot token | Channel messages | User-provided |
| **Discord** | User bot token | Server messages | User-provided |
| **Bluesky** | Public API | Posts, reposts | Liberal |
| **Truth Social** | Scraper | Posts | Fragile |
| **RSS** | Public | Any feed | Unlimited |

**Data Flow:**
```
Platforms → social-follows scheduler → sentiment_hourly (per-ticker rollup)
                                    → LLM sentiment scoring
                                    → Divergence engine (human vs bot)
                                    → DSL: social_sentiment(), reddit_mentions()
```

**Key File:** `packages/be/src/social/service.ts`
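
The per-ticker rollup into `sentiment_hourly` might look roughly like the sketch below. The `ScoredPost` and `HourlyRollup` field names are assumptions, and the sketch presumes each post has already been assigned a sentiment score; the real table columns may differ.

```typescript
// A scored social post; field names are assumptions for this sketch.
interface ScoredPost {
  ticker: string;
  postedAt: string; // ISO timestamp
  sentiment: number; // -1 (bearish) .. +1 (bullish)
}

interface HourlyRollup {
  ticker: string;
  hour: string; // ISO timestamp truncated to the hour (UTC)
  mentions: number;
  avgSentiment: number;
}

function rollupHourly(posts: ScoredPost[]): HourlyRollup[] {
  const buckets = new Map<string, { ticker: string; hour: string; sum: number; n: number }>();
  for (const p of posts) {
    // "2024-05-01T14:37:09.000Z" -> "2024-05-01T14:00:00.000Z"
    const hour = new Date(p.postedAt).toISOString().slice(0, 13) + ":00:00.000Z";
    const key = `${p.ticker}|${hour}`;
    const b = buckets.get(key) ?? { ticker: p.ticker, hour, sum: 0, n: 0 };
    b.sum += p.sentiment;
    b.n += 1;
    buckets.set(key, b);
  }
  return Array.from(buckets.values()).map((b) => ({
    ticker: b.ticker,
    hour: b.hour,
    mentions: b.n,
    avgSentiment: b.sum / b.n,
  }));
}
```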

---

### 7. LLM Providers

AI-powered analysis and generation.

| Provider | Model | Use Case | Pricing |
|----------|-------|----------|---------|
| **Anthropic Claude** | claude-sonnet-4, claude-haiku-4 | Analysis, DSL translation | Per token |
| **Agencio LLMRouter** | Routed | Primary when configured | Platform |
| **Voyage AI** | voyage-3-lite | Text embeddings | Per token |
| **OpenAI** | text-embedding-3-small | Embedding fallback | Per token |

**Data Flow:**
```
User request → llm/client.ts → LLMRouter → Anthropic fallback
                            → Zod validation (anti-hallucination)
                            → Stock Hunter reports
                            → Algorithm DSL translation
                            → LLM Jury decisions
```

**Key File:** `packages/be/src/algorithms/llm/client.ts`
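
The validation step in the flow rejects model output that does not match the expected structure before anything downstream consumes it. The sketch below shows the idea with a hand-written guard standing in for a Zod schema, so the example has no dependencies. The `Verdict` shape and `parseVerdict` name are hypothetical and are not the platform's real schema.

```typescript
// Hypothetical shape of a structured LLM answer; not the platform's real schema.
interface Verdict {
  ticker: string;
  action: "buy" | "hold" | "sell";
  confidence: number; // 0..1
}

// Hand-written guard standing in for a Zod schema; returns null on any mismatch.
function parseVerdict(raw: string): Verdict | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // model returned non-JSON text
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const actions = ["buy", "hold", "sell"];
  if (typeof d.ticker !== "string" || d.ticker.length === 0) return null;
  if (typeof d.action !== "string" || !actions.includes(d.action)) return null;
  if (typeof d.confidence !== "number" || d.confidence < 0 || d.confidence > 1) return null;
  return { ticker: d.ticker, action: d.action as Verdict["action"], confidence: d.confidence };
}
```

A `null` result is the signal to retry or fall back rather than pass a malformed answer into a report or a DSL translation.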

---

### 8. Broker Integrations

Trade execution across asset classes.

| Broker | Assets | Auth | Paper/Live |
|--------|--------|------|------------|
| **Alpaca** | Stocks, ETFs, Crypto | API Key | Both |
| **IBKR** | Multi-asset | Session | Both (Gateway required) |
| **Binance** | Crypto | HMAC | Both (testnet/prod) |
| **Pepperstone** | Forex, CFDs | OAuth 2.0 | Both |
| **Schwab** | US Equities, Options | OAuth 2.0 | Both (7-day token) |

**Data Flow:**
```
Algorithm signal → executor.ts → BrokerAdapter interface
                             → Preflight checks (4 gates)
                             → Order placement
                             → Position/fill tracking
                             → algorithm_trades table
```

**Key File:** `packages/be/src/brokers/service.ts`
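
This document does not enumerate the four preflight gates, so the sketch below only shows the general shape: an ordered list of checks that must all pass before the adapter is asked to place an order. `BrokerAdapterSketch`, `Gate`, and `executeWithGates` are hypothetical names, and the call is synchronous here for brevity where a real adapter would almost certainly be asynchronous.

```typescript
interface Order {
  symbol: string;
  side: "buy" | "sell";
  qty: number;
}

// Minimal adapter surface assumed for this sketch; the real BrokerAdapter is larger.
interface BrokerAdapterSketch {
  placeOrder(order: Order): string; // returns a broker order id
}

// A gate returns null to pass, or a reason string to block the order.
type Gate = { name: string; check: (order: Order) => string | null };

type ExecutionResult =
  | { ok: true; orderId: string }
  | { ok: false; blockedBy: string; reason: string };

function executeWithGates(order: Order, gates: Gate[], broker: BrokerAdapterSketch): ExecutionResult {
  for (const gate of gates) {
    const reason = gate.check(order);
    // Stop at the first failing gate; the broker is never contacted.
    if (reason !== null) return { ok: false, blockedBy: gate.name, reason };
  }
  return { ok: true, orderId: broker.placeOrder(order) };
}
```

Returning the name of the blocking gate makes a rejected order easy to explain in logs and in the UI.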

---

## Integration Configuration

### Platform-Level Keys

Configured by admins at `/admin/integrations`:

```
┌─────────────────────────────────────────────────────────────┐
│                   Admin Integration UI                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  For each integration:                                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Status:  ● Configured  ○ Not Configured             │   │
│  │ Source:  [Database] / [Environment Variable]        │   │
│  │ Key:     ****c123                                   │   │
│  │                                                     │   │
│  │ [Test Connection]  [Configure]  [Tier Info]        │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

**Storage Priority:**
1. Database (`predict.platform_integration_secrets`) — encrypted
2. Environment variable fallback

### User BYOK Keys

Users can provide their own keys at `/settings/ai-billing/byok`:

| Provider | BYOK Supported | Use Case |
|----------|---------------|----------|
| Anthropic Claude | Yes | Personal LLM quota |
| OpenAI | Yes | Personal embeddings |
| Polygon.io | Yes | Personal market data quota |

**Key Priority:**
1. User BYOK key (if available)
2. Platform key (fallback)
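
Combined with the platform storage priority above (database first, then environment variable), key resolution can be sketched as a three-step lookup. The `KeyLookups` interface, the `resolveApiKey` name, and the environment variable name passed in are all assumptions for illustration; the lookups are injected so the sketch does not presume any storage API.

```typescript
type KeySource = "byok" | "platform-db" | "env";

// Lookup functions are injected so the sketch stays storage-agnostic (all hypothetical).
interface KeyLookups {
  userByok: (userId: string, provider: string) => string | undefined;
  platformDb: (provider: string) => string | undefined;
  env: Record<string, string | undefined>;
}

function resolveApiKey(
  userId: string,
  provider: string,
  envVar: string,
  lookups: KeyLookups,
): { key: string; source: KeySource } | null {
  // 1. A user's own key wins when present.
  const byok = lookups.userByok(userId, provider);
  if (byok) return { key: byok, source: "byok" };
  // 2. Otherwise use the platform key stored in the database.
  const dbKey = lookups.platformDb(provider);
  if (dbKey) return { key: dbKey, source: "platform-db" };
  // 3. Finally fall back to the environment variable.
  const envKey = lookups.env[envVar];
  if (envKey) return { key: envKey, source: "env" };
  return null;
}
```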

---

## Enrichment & Consumption

### How Data Enriches the Platform

```
┌────────────────────────────────────────────────────────────────────────────┐
│                        DATA ENRICHMENT MATRIX                               │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                            │
│  DATA SOURCE          ENRICHES                 CONSUMED BY                 │
│  ─────────────        ────────                 ───────────                 │
│                                                                            │
│  Yahoo/Polygon        Price history            • DSL price() primitive     │
│  prices               Technical indicators     • Pattern detection         │
│                                                • Backtest engine           │
│                                                • Live quotes UI            │
│                                                                            │
│  Prediction           Crowd probability        • All-Seeing Eye            │
│  markets              Event outcomes           • DSL prediction_market()   │
│                                                • Market probability UI     │
│                                                                            │
│  Social               Sentiment scores         • Divergence engine         │
│  sentiment            Mention counts           • DSL social_sentiment()    │
│                                                • Sentiment charts          │
│                                                                            │
│  FRED                 Yield curves             • Bond overlays             │
│  economic             Economic indicators      • Black swan detector       │
│                                                • DSL yield_spread()        │
│                                                                            │
│  SEC                  Institutional ownership  • Stock Hunter reports      │
│  13F/13D              Activist positions       • DSL institutional_*()     │
│                                                • Holdings UI               │
│                                                                            │
│  Derivatives          VIX, funding rates       • Crisis detection          │
│                       Open interest            • DSL vix(), funding_rate() │
│                                                • Derivatives dashboard     │
│                                                                            │
│  LLM                  Natural language         • Stock Hunter analysis     │
│  Claude               DSL translation          • Algorithm translation     │
│                       Report generation        • Support chat RAG          │
│                                                                            │
│  Brokers              Order execution          • Paper/live trading        │
│                       Position sync            • Portfolio tracking        │
│                                                • P&L calculation           │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘
```

### DSL Primitive Mapping

| Category | Primitives | Data Sources |
|----------|------------|--------------|
| **Price** | `price()`, `sma()`, `rsi()`, `macd()` | Yahoo, Polygon, Finnhub |
| **Sentiment** | `social_sentiment()`, `news_sentiment()` | Social platforms, NewsAPI |
| **Prediction** | `prediction_market()`, `polymarket_*()` | Polymarket, Kalshi, Metaculus |
| **Macro** | `yield_spread()`, `fed_funds()`, `inflation()` | FRED |
| **Institutional** | `institutional_ownership_pct()`, `smart_money_*()` | SEC 13F |
| **Activist** | `has_activist_position()`, `activist_count()` | SEC 13D/13G |
| **Derivatives** | `vix()`, `funding_rate()`, `fear_greed()` | Yahoo, Binance, Alternative.me |
| **Pattern** | `is_fvg()`, `is_hammer()`, `dtw_similarity()` | Computed from prices |
| **Manipulation** | `is_stop_hunt()`, `manipulation_risk_score()` | Binance L2, Polygon trades |
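
To make the mapping concrete, here is a minimal sketch of how a price-derived primitive such as `sma()` could be computed from a series of closing prices. The signature is an assumption for illustration; the actual DSL evaluator may take a symbol and timeframe and fetch the prices itself.

```typescript
// Simple moving average over the most recent `period` closes.
// Returns null when there is not enough history or the period is invalid.
function sma(closes: number[], period: number): number | null {
  if (!Number.isInteger(period) || period <= 0 || closes.length < period) return null;
  let sum = 0;
  for (let i = closes.length - period; i < closes.length; i++) sum += closes[i];
  return sum / period;
}
```

Returning `null` instead of a partial average keeps a rule such as "price above its 50-period average" from evaluating on too little data.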

---

## Fallback & Resilience

### Provider Fallback Chains

```
Price Data:    Alpaca → Yahoo → Finnhub
Crypto:        Alpaca → Yahoo → CoinGecko → Binance
Forex:         Polygon → Yahoo → Frankfurter
Embeddings:    Voyage → OpenAI
LLM:           LLMRouter → Anthropic Direct
```

### Graceful Degradation

When a provider fails:
1. Log warning with error details
2. Try next provider in chain
3. Return empty array/null if all fail
4. UI shows "Data unavailable" message
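
The fallback chains and degradation steps above can be sketched as a single loop over an ordered provider list. `NamedProvider` and `getPriceWithFallback` are hypothetical names, and the fetchers are injected rather than tied to any real provider client.

```typescript
type PriceFetcher = (symbol: string) => Promise<number>;

interface NamedProvider {
  name: string;
  fetch: PriceFetcher;
}

async function getPriceWithFallback(
  symbol: string,
  chain: NamedProvider[],
  warn: (msg: string) => void = console.warn,
): Promise<{ price: number; provider: string } | null> {
  for (const p of chain) {
    try {
      const price = await p.fetch(symbol);
      return { price, provider: p.name };
    } catch (err) {
      // Step 1: log a warning with error details, then (step 2) try the next provider.
      const detail = err instanceof Error ? err.message : String(err);
      warn(`[${p.name}] ${symbol} failed: ${detail}`);
    }
  }
  // Step 3: every provider failed; the caller renders "Data unavailable" (step 4).
  return null;
}
```

Passing the chain as data means the stock, crypto, and forex orderings listed above can share one implementation.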

---

## Monitoring & Observability

### Integration Health

| Metric | Location | Alert Threshold |
|--------|----------|-----------------|
| API response time | CloudWatch | > 5s |
| Error rate | Structured logs | > 5% |
| Rate limit hits | Rate limiter logs | > 80% capacity |
| BYOK validation failures | byok-service logs | Any |
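
The thresholds in the table can be expressed as a small check over a metrics sample. The `HealthSample` field names and alert identifiers are assumptions for this sketch; only the threshold values come from the table.

```typescript
interface HealthSample {
  responseTimeMs: number;
  errorRate: number; // fraction of failed calls, 0..1
  rateLimitUsage: number; // fraction of rate-limit capacity used, 0..1
  byokValidationFailures: number;
}

// Returns the identifiers of every threshold the sample breaches.
function breachedAlerts(s: HealthSample): string[] {
  const alerts: string[] = [];
  if (s.responseTimeMs > 5000) alerts.push("api_response_time"); // > 5s
  if (s.errorRate > 0.05) alerts.push("error_rate"); // > 5%
  if (s.rateLimitUsage > 0.8) alerts.push("rate_limit"); // > 80% capacity
  if (s.byokValidationFailures > 0) alerts.push("byok_validation"); // any failure
  return alerts;
}
```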

### Admin Dashboard

`/admin/integrations` provides:
- Configuration status for each integration
- Live connection test buttons
- Tier detection (Polygon)
- Last successful sync timestamps

---

## Security Considerations

### API Key Protection

| Layer | Protection |
|-------|------------|
| Storage | AES-256-GCM encryption |
| Transit | HTTPS only |
| Logs | Keys never written to logs |
| UI | Only last 4 chars shown |
| Access | Admin-only configuration |

### User BYOK Security

- Per-user salt for encryption
- Keys never stored in plaintext
- Validation before storage
- Automatic invalidation on repeated failures
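
A minimal sketch of AES-256-GCM encryption with a per-record salt, using Node's built-in `crypto` module. The key derivation via `scryptSync`, the `EncryptedKey` record layout, and the function names are assumptions; this document states only the cipher, the per-user salt, and the last-four-characters display rule.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hex-encoded fields as they might be stored; layout is assumed for illustration.
interface EncryptedKey {
  salt: string;
  iv: string;
  tag: string;
  ciphertext: string;
}

function encryptApiKey(plaintext: string, masterSecret: string): EncryptedKey {
  const salt = randomBytes(16); // fresh salt per user record
  const key = scryptSync(masterSecret, salt, 32); // 32 bytes for AES-256
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("hex"),
    iv: iv.toString("hex"),
    tag: cipher.getAuthTag().toString("hex"),
    ciphertext: ciphertext.toString("hex"),
  };
}

function decryptApiKey(rec: EncryptedKey, masterSecret: string): string {
  const key = scryptSync(masterSecret, Buffer.from(rec.salt, "hex"), 32);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(rec.iv, "hex"));
  decipher.setAuthTag(Buffer.from(rec.tag, "hex")); // final() throws if the tag does not verify
  const plain = Buffer.concat([decipher.update(Buffer.from(rec.ciphertext, "hex")), decipher.final()]);
  return plain.toString("utf8");
}

// UI masking: show only the last 4 characters.
function maskKey(key: string): string {
  return "****" + key.slice(-4);
}
```

Because GCM is authenticated, decrypting with the wrong secret or a tampered record throws instead of returning garbage, which gives a natural hook for the invalidation behaviour listed above.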

---

## Related Documentation

- [Integration Registry](./internal/integrations/26-integration-registry.md) — Detailed provider catalog
- [Platform Integration Secrets](./internal/integrations/73-platform-integration-secrets.md) — Key management
- [Polygon BYOK & Tier Detection](./84-polygon-byok-tier-detection.md) — Polygon-specific details
- [API Routes](./internal/architecture/24-api-routes.md) — All API endpoints
- [Data Feed LLM OpEx](./internal/ai-ml/25-data-feed-llm-opex.md) — Cost projections
