Polymarket Wallet Clustering

Fraud Detection & Manipulation Analysis for Prediction Markets

System Overview

The wallet clustering system detects coordinated trading behavior on Polymarket by analyzing on-chain transaction patterns. It identifies suspicious clusters of wallets that may be engaging in market manipulation.

Connection Types: 5
Suspicion Reasons: 6
DSL Primitives: 10
Bet Sync Interval: 6h

Manipulation Types Detected

Data Ingestion Pipeline

Bet data flows from Polymarket's CLOB API through our ingestion pipeline into the clustering analysis engine.

flowchart TB
    subgraph Polymarket["Polymarket CLOB API"]
        CLOB["/trades endpoint"]
    end
    subgraph Scheduler["Scheduler Jobs"]
        SYNC["sync-polymarket-bets<br/>(Every 6 hours)"]
        INC["wallet-clustering-incremental<br/>(Daily)"]
        FULL["wallet-clustering-full-scan<br/>(Weekly)"]
    end
    subgraph Ingestion["Data Ingestion"]
        FETCH["fetchAllRecentTrades()<br/>Rate-limited pagination"]
        CONVERT["clobTradeToBetInput()<br/>Normalize trade data"]
        UPSERT["upsertBet()<br/>upsertWallet()"]
    end
    subgraph Database["PostgreSQL Tables"]
        WALLETS[("pm_wallets<br/>Address, cluster_id,<br/>suspicion_score")]
        BETS[("pm_wallet_bets<br/>wallet_address, market_id,<br/>outcome, amount")]
        CLUSTERS[("pm_wallet_clusters<br/>wallet_count, suspicion_score,<br/>top_markets")]
        CONNECTIONS[("pm_wallet_connections<br/>wallet_a, wallet_b,<br/>connection_type, strength")]
    end
    subgraph Analysis["Clustering Analysis"]
        UF["Union-Find Algorithm"]
        CONN["Connection Detection"]
        SCORE["Suspicion Scoring"]
    end
    CLOB --> SYNC
    SYNC --> FETCH
    FETCH --> CONVERT
    CONVERT --> UPSERT
    UPSERT --> WALLETS
    UPSERT --> BETS
    INC --> UF
    FULL --> UF
    WALLETS --> CONN
    BETS --> CONN
    CONN --> CONNECTIONS
    UF --> CLUSTERS
    CONNECTIONS --> SCORE
    SCORE --> CLUSTERS
    SCORE --> WALLETS
    style SYNC fill:#a371f7,color:#000
    style SCORE fill:#f85149,color:#fff

CLOB Trade Structure

Field              Type     Description
id                 string   Unique trade ID
taker_address      string   Wallet placing the bet (primary for analysis)
maker_address      string   Counterparty wallet
market             string   Market/condition ID
outcome            Yes/No   Bet outcome side
size               string   Number of shares
price              string   Price per share
timestamp          number   Unix timestamp
transaction_hash   string   On-chain tx hash
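
The field table above maps naturally onto a TypeScript type. The following is a minimal sketch, not the actual implementation: it assumes that `size` and `price` are decimal strings, that `timestamp` is in seconds, and that the USD amount of a bet is `size * price`. The `BetInput` shape and the body of `clobTradeToBetInput` shown here are illustrative guesses; only the function name and the trade fields come from this document.

```typescript
// Shape of a CLOB trade, following the field table above.
interface ClobTrade {
  id: string;
  taker_address: string;
  maker_address: string;
  market: string;
  outcome: "Yes" | "No";
  size: string; // number of shares, as a decimal string
  price: string; // price per share, as a decimal string
  timestamp: number; // Unix timestamp (assumed to be in seconds)
  transaction_hash: string;
}

// Hypothetical normalized record; the real BetInput may differ.
interface BetInput {
  tradeId: string;
  walletAddress: string;
  marketId: string;
  outcome: "Yes" | "No";
  amountUsd: number;
  placedAt: Date;
  txHash: string;
}

function clobTradeToBetInput(trade: ClobTrade): BetInput {
  const size = Number(trade.size);
  const price = Number(trade.price);
  if (!Number.isFinite(size) || !Number.isFinite(price)) {
    throw new Error(`Trade ${trade.id} has a non-numeric size or price`);
  }
  return {
    tradeId: trade.id,
    // The taker is the wallet placing the bet; lowercase for stable joins.
    walletAddress: trade.taker_address.toLowerCase(),
    marketId: trade.market,
    outcome: trade.outcome,
    // Assumption: USD amount = shares * price, rounded to the cent.
    amountUsd: Math.round(size * price * 100) / 100,
    placedAt: new Date(trade.timestamp * 1000),
    txHash: trade.transaction_hash,
  };
}
```

Rounding to the cent at ingestion keeps the later sizing comparison ("identical to the cent") independent of floating-point noise.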

Connection Types

The system detects 5 types of connections between wallets, each indicating different coordination patterns.

Funding

The same wallet funded both addresses. Multi-hop funding chains are traced up to depth N.

High Signal

Timing

Bets placed within 30 seconds of each other on the same market.

Strong Signal

Sizing

Identical USD bet amounts (to the cent) across multiple bets.

Strong Signal

Market Focus

Jaccard index >= 0.3 on the sets of markets traded, indicating similar market selection patterns.

Moderate Signal

Win Pattern

Correlation >= 0.7 on bet outcomes: both wallets consistently win or lose together.

Moderate Signal
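
The two moderate-signal checks reduce to standard formulas. The sketch below assumes a set-based Jaccard index over market IDs and a Pearson correlation over win/loss outcomes (1 = won, 0 = lost) on markets both wallets traded. The function and constant names are hypothetical, and this document does not state which correlation measure the system actually uses; only the 0.3 and 0.7 thresholds come from the text above.

```typescript
const MARKET_FOCUS_MIN_JACCARD = 0.3; // threshold from the Market Focus card
const WIN_PATTERN_MIN_CORRELATION = 0.7; // threshold from the Win Pattern card

// Jaccard index: |A ∩ B| / |A ∪ B| over the markets each wallet traded.
function jaccardIndex(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((market) => {
    if (b.has(market)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Pearson correlation of two equally indexed outcome series.
// Returns 0 when either series is constant (correlation is undefined).
function pearson(xs: number[], ys: number[]): number {
  const n = Math.min(xs.length, ys.length);
  if (n < 2) return 0;
  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < n; i++) {
    sumX += xs[i];
    sumY += ys[i];
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  const denom = Math.sqrt(varX * varY);
  return denom === 0 ? 0 : cov / denom;
}

function hasMarketFocusConnection(a: Set<string>, b: Set<string>): boolean {
  return jaccardIndex(a, b) >= MARKET_FOCUS_MIN_JACCARD;
}

function hasWinPatternConnection(winsA: number[], winsB: number[]): boolean {
  return pearson(winsA, winsB) >= WIN_PATTERN_MIN_CORRELATION;
}
```

For example, wallets trading {m1, m2, m3} and {m2, m3, m4} share two of four distinct markets, giving a Jaccard index of 0.5, which clears the 0.3 threshold.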

flowchart LR
    subgraph Detection["Connection Detection"]
        F["Funding Analysis<br/>On-chain tracing"]
        T["Timing Analysis<br/>getCoincidentBets()"]
        S["Sizing Analysis<br/>Match USD amounts"]
        M["Market Focus<br/>Jaccard overlap"]
        W["Win Pattern<br/>Outcome correlation"]
    end
    subgraph Evidence["Evidence Objects"]
        FE["FundingEvidence<br/>commonFunder, hopDistance"]
        TE["TimingEvidence<br/>avgDeltaSeconds, correlationCoeff"]
        SE["SizingEvidence<br/>matchingAmounts, matchRatio"]
        ME["MarketFocusEvidence<br/>jaccardIndex, sharedMarkets"]
        WE["WinPatternEvidence<br/>correlationCoeff, bothWonCount"]
    end
    subgraph Output["Connection Record"]
        CONN["PMConnection<br/>walletA, walletB<br/>connectionType<br/>strength (0-1)"]
    end
    F --> FE --> CONN
    T --> TE --> CONN
    S --> SE --> CONN
    M --> ME --> CONN
    W --> WE --> CONN
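
The timing and sizing analyses in the diagram can be sketched as pairwise comparisons of two wallets' bets. This is an illustration under stated assumptions: the `Bet` shape is hypothetical, `coincidentCount` is an added field, and `matchRatio` is assumed here to mean the fraction of wallet A's bets whose cent-exact amount also appears among wallet B's bets (the document does not define it). The real `getCoincidentBets()` is not shown in this document and may work differently.

```typescript
// Hypothetical minimal bet shape for illustration.
interface Bet {
  marketId: string;
  amountUsd: number;
  timestamp: number; // Unix seconds
}

const TIMING_WINDOW_SECONDS = 30; // "within 30 seconds" from the Timing card

interface TimingEvidence {
  coincidentCount: number;
  avgDeltaSeconds: number;
}

// Collect bet pairs on the same market placed within the timing window.
function timingEvidence(a: Bet[], b: Bet[]): TimingEvidence | null {
  const deltas: number[] = [];
  for (const x of a) {
    for (const y of b) {
      if (x.marketId !== y.marketId) continue;
      const delta = Math.abs(x.timestamp - y.timestamp);
      if (delta <= TIMING_WINDOW_SECONDS) deltas.push(delta);
    }
  }
  if (deltas.length === 0) return null;
  const total = deltas.reduce((sum, d) => sum + d, 0);
  return { coincidentCount: deltas.length, avgDeltaSeconds: total / deltas.length };
}

interface SizingEvidence {
  matchingAmounts: number[]; // distinct USD amounts seen in both wallets
  matchRatio: number; // assumed: share of A's bets with a match in B
}

// Compare amounts in integer cents to avoid floating-point mismatches.
function sizingEvidence(a: Bet[], b: Bet[]): SizingEvidence | null {
  if (a.length === 0) return null;
  const toCents = (bet: Bet) => Math.round(bet.amountUsd * 100);
  const centsB = new Set(b.map(toCents));
  const matched = a.map(toCents).filter((cents) => centsB.has(cents));
  if (matched.length === 0) return null;
  const distinct = Array.from(new Set(matched));
  return {
    matchingAmounts: distinct.map((cents) => cents / 100),
    matchRatio: matched.length / a.length,
  };
}
```

The nested loop is quadratic in the number of bets per wallet pair; a production version would more likely sort by market and timestamp or push the window join into SQL.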

Clustering Algorithm

Wallets are grouped into clusters with the Union-Find (disjoint-set) algorithm: each detected connection merges the clusters of its two wallets.

flowchart TB
    subgraph Input["Input Data"]
        W1["Wallet A"]
        W2["Wallet B"]
        W3["Wallet C"]
        W4["Wallet D"]
        W5["Wallet E"]
    end
    subgraph Connections["Detected Connections"]
        C1["A ↔ B<br/>Timing: 0.8"]
        C2["A ↔ C<br/>Funding: 0.9"]
        C3["D ↔ E<br/>Sizing: 0.7"]
    end
    subgraph UnionFind["Union-Find"]
        UF1["Union(A, B)"]
        UF2["Union(A, C)<br/>→ C joins cluster"]
        UF3["Union(D, E)"]
    end
    subgraph Clusters["Resulting Clusters"]
        CL1["Cluster 1<br/>A, B, C"]
        CL2["Cluster 2<br/>D, E"]
    end
    W1 --> C1
    W2 --> C1
    W1 --> C2
    W3 --> C2
    W4 --> C3
    W5 --> C3
    C1 --> UF1
    C2 --> UF2
    C3 --> UF3
    UF1 --> CL1
    UF2 --> CL1
    UF3 --> CL2
    style CL1 fill:#f85149,color:#fff
    style CL2 fill:#f0883e,color:#000
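
The diagram's example can be reproduced with a small Union-Find over wallet addresses. This is a generic textbook sketch (path compression plus union by rank), not the project's code; the class and method names are hypothetical. It merges on every connection, whereas the real system may first filter connections by strength.

```typescript
class UnionFind {
  private parent = new Map<string, string>();
  private rank = new Map<string, number>();

  // Register a wallet as its own singleton cluster if unseen.
  add(x: string): void {
    if (!this.parent.has(x)) {
      this.parent.set(x, x);
      this.rank.set(x, 0);
    }
  }

  // Find the cluster representative, compressing the path on the way.
  find(x: string): string {
    this.add(x);
    let root = x;
    while (this.parent.get(root) !== root) {
      root = this.parent.get(root)!;
    }
    let current = x;
    while (current !== root) {
      const next = this.parent.get(current)!;
      this.parent.set(current, root);
      current = next;
    }
    return root;
  }

  // Merge the clusters containing a and b (union by rank).
  union(a: string, b: string): void {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA === rootB) return;
    const rankA = this.rank.get(rootA)!;
    const rankB = this.rank.get(rootB)!;
    if (rankA < rankB) {
      this.parent.set(rootA, rootB);
    } else {
      this.parent.set(rootB, rootA);
      if (rankA === rankB) this.rank.set(rootA, rankA + 1);
    }
  }

  // Group all known wallets by their representative.
  clusters(): string[][] {
    const groups = new Map<string, string[]>();
    for (const wallet of Array.from(this.parent.keys())) {
      const root = this.find(wallet);
      const members = groups.get(root) ?? [];
      members.push(wallet);
      groups.set(root, members);
    }
    return Array.from(groups.values());
  }
}

// The three connections from the diagram above.
const uf = new UnionFind();
uf.union("A", "B"); // Timing: 0.8
uf.union("A", "C"); // Funding: 0.9
uf.union("D", "E"); // Sizing: 0.7
const walletClusters = uf.clusters();
```

Because membership is transitive, B and C end up in the same cluster even though no direct B ↔ C connection was detected.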

Suspicion Scoring

Each wallet and cluster is assigned a suspicion score (0-1) based on multiple indicators.

Suspicion Reasons

Reason                 Threshold                                     Weight
abnormal_win_rate      Cluster win rate > 70%                        High
high_coordination      Many strong connections (> 5)                 High
funding_chain          Multi-hop funding obscuration                 High
size_pattern           Too-identical bet sizing (> 50% match)        Medium
timing_coordination    Avg timing delta < 30 seconds                 Medium
market_concentration   Cluster dominates specific markets (> 30%)    Medium
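
The thresholds in the table translate directly into a rule check. The sketch below is illustrative only: the `ClusterStats` input shape is hypothetical, "multi-hop" is assumed to mean a funding hop distance greater than 1, and the numeric weights (High = 2, Medium = 1, normalized by the total) are invented for the example, since the document gives only qualitative weights and not the scoring formula.

```typescript
type SuspicionReason =
  | "abnormal_win_rate"
  | "high_coordination"
  | "funding_chain"
  | "size_pattern"
  | "timing_coordination"
  | "market_concentration";

// Hypothetical pre-computed inputs; ratios are expressed as 0-1.
interface ClusterStats {
  combinedWinRate: number;
  strongConnectionCount: number;
  maxFundingHops: number; // 0 when no common funder was found
  sizeMatchRatio: number;
  avgBetTimingSeconds: number | null; // null when no coordinated bets
  topMarketShare: number;
}

// Assumed numeric weights: High = 2, Medium = 1.
const REASON_WEIGHTS: Record<SuspicionReason, number> = {
  abnormal_win_rate: 2,
  high_coordination: 2,
  funding_chain: 2,
  size_pattern: 1,
  timing_coordination: 1,
  market_concentration: 1,
};

function evaluateSuspicion(stats: ClusterStats): {
  reasons: SuspicionReason[];
  score: number;
} {
  const reasons: SuspicionReason[] = [];
  if (stats.combinedWinRate > 0.7) reasons.push("abnormal_win_rate");
  if (stats.strongConnectionCount > 5) reasons.push("high_coordination");
  if (stats.maxFundingHops > 1) reasons.push("funding_chain");
  if (stats.sizeMatchRatio > 0.5) reasons.push("size_pattern");
  if (stats.avgBetTimingSeconds !== null && stats.avgBetTimingSeconds < 30) {
    reasons.push("timing_coordination");
  }
  if (stats.topMarketShare > 0.3) reasons.push("market_concentration");

  // Normalize by the total weight so the score stays within 0-1.
  const allReasons = Object.keys(REASON_WEIGHTS) as SuspicionReason[];
  const totalWeight = allReasons.reduce((sum, r) => sum + REASON_WEIGHTS[r], 0);
  const triggered = reasons.reduce((sum, r) => sum + REASON_WEIGHTS[r], 0);
  return { reasons, score: triggered / totalWeight };
}
```

Under these assumed weights, a cluster with an 80% win rate and a 12-second average timing delta triggers one High and one Medium reason, for a score of 3/9.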

Cluster Metrics

Field                 Description
walletCount           Number of wallets in cluster
suspicionScore        Aggregate score 0-1
suspicionReasons      Array of triggered reasons
combinedWinRate       Cluster-wide win rate
totalVolumeUsd        Total betting volume
totalBets             Total number of bets
topMarkets            Markets where cluster is most active
avgBetTimingSeconds   Average timing between coordinated bets
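
Several of these metrics are plain aggregates over the cluster's bets. A minimal sketch, assuming a hypothetical `ClusterBet` shape in which `won` is `null` for unresolved markets, that the win rate is computed over resolved bets only, and that "most active" means highest USD volume; none of these choices is confirmed by this document.

```typescript
// Hypothetical input shape: one row per bet placed by any cluster member.
interface ClusterBet {
  marketId: string;
  amountUsd: number;
  won: boolean | null; // null while the market is unresolved
}

interface ClusterSummary {
  totalBets: number;
  totalVolumeUsd: number;
  combinedWinRate: number;
  topMarkets: string[];
}

function summarizeCluster(bets: ClusterBet[], topN = 3): ClusterSummary {
  let totalVolumeUsd = 0;
  let wins = 0;
  let resolved = 0;
  const volumeByMarket = new Map<string, number>();

  for (const bet of bets) {
    totalVolumeUsd += bet.amountUsd;
    if (bet.won !== null) {
      resolved++;
      if (bet.won) wins++;
    }
    const previous = volumeByMarket.get(bet.marketId) ?? 0;
    volumeByMarket.set(bet.marketId, previous + bet.amountUsd);
  }

  // Rank markets by the cluster's USD volume, highest first.
  const topMarkets = Array.from(volumeByMarket.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN)
    .map(([marketId]) => marketId);

  return {
    totalBets: bets.length,
    totalVolumeUsd,
    // Assumption: unresolved bets are excluded from the win rate.
    combinedWinRate: resolved === 0 ? 0 : wins / resolved,
    topMarkets,
  };
}
```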

DSL Primitives

Ten DSL primitives expose wallet clustering data to the Algorithm Builder: five keyed by wallet address and five keyed by market.

Wallet-Based Primitives

Primitive                              Returns   Description
polymarket_wallet_suspicion(address)   number    Suspicion score for specific wallet (0-1)
polymarket_cluster_size(address)       number    Number of wallets in same cluster
polymarket_wallet_win_rate(address)    number    Historical win rate for wallet
polymarket_is_flagged(address)         boolean   True if wallet marked suspicious
polymarket_connection_count(address)   number    Number of connections to other wallets

Market-Based Primitives

Primitive                                    Returns   Description
polymarket_smart_money_consensus(market)     number    Consensus among high-win-rate wallets (-1 to +1)
polymarket_manipulation_risk(market)         number    Risk score based on cluster activity (0-1)
polymarket_whale_flow(market)                number    Net flow from high-volume wallets
polymarket_has_suspicious_activity(market)   boolean   True if suspicious clusters active in market
polymarket_coordination_score(market)        number    Degree of coordinated activity (0-1)

Example Usage

// Avoid markets with high manipulation risk
entry: polymarket_manipulation_risk("btc-100k-2024") < 0.3 && polymarket_smart_money_consensus("btc-100k-2024") > 0.5

// Filter for clean signals
exit: polymarket_has_suspicious_activity("fed-rate-cut-june") == false

Scheduler Jobs

Three scheduler jobs maintain the wallet clustering system.

Job                             Interval   Description
sync-polymarket-bets            6 hours    Fetches recent trades from CLOB API, ingests into pm_wallet_bets
wallet-clustering-incremental   Daily      Analyzes recently active wallets, updates connections and clusters
wallet-clustering-full-scan     Weekly     Comprehensive analysis of all top wallets by volume

gantt
    title Wallet Clustering Schedule
    dateFormat HH:mm
    axisFormat %H:%M
    section Bet Sync
    sync-polymarket-bets :a1, 00:00, 30m
    sync-polymarket-bets :a2, 06:00, 30m
    sync-polymarket-bets :a3, 12:00, 30m
    sync-polymarket-bets :a4, 18:00, 30m
    section Clustering
    wallet-clustering-incremental :b1, 02:00, 2h

API Endpoints

Endpoint                                              Method   Description
/api/predict/v1/wallet-clustering/wallets             GET      List suspicious wallets
/api/predict/v1/wallet-clustering/wallets/:address    GET      Get wallet details and connections
/api/predict/v1/wallet-clustering/clusters            GET      List suspicious clusters
/api/predict/v1/wallet-clustering/clusters/:id        GET      Get cluster details with member wallets
/api/predict/v1/wallet-clustering/market/:marketId    GET      Get cluster activity in a specific market
/api/predict/v1/wallet-clustering/scans               GET      List recent scan history
/api/predict/v1/admin/wallet-clustering/scan          POST     Trigger manual scan (admin only)
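
A client might centralize these routes in one place. The sketch below only builds the paths listed in the table; the helper names are hypothetical, and the response shapes, authentication, and query parameters are not documented here, so `getJson` returns an untyped generic and sends no credentials.

```typescript
const WALLET_CLUSTERING_BASE = "/api/predict/v1/wallet-clustering";

// Path builders for the endpoints in the table above.
const walletClusteringPaths = {
  wallets: () => `${WALLET_CLUSTERING_BASE}/wallets`,
  wallet: (address: string) =>
    `${WALLET_CLUSTERING_BASE}/wallets/${encodeURIComponent(address)}`,
  clusters: () => `${WALLET_CLUSTERING_BASE}/clusters`,
  cluster: (id: string) =>
    `${WALLET_CLUSTERING_BASE}/clusters/${encodeURIComponent(id)}`,
  market: (marketId: string) =>
    `${WALLET_CLUSTERING_BASE}/market/${encodeURIComponent(marketId)}`,
  scans: () => `${WALLET_CLUSTERING_BASE}/scans`,
  // Admin-only POST endpoint; note the different prefix.
  adminScan: () => "/api/predict/v1/admin/wallet-clustering/scan",
};

// Generic GET helper. `origin` (e.g. "https://example.test") is a placeholder;
// T is whatever shape the caller expects, since responses are undocumented.
async function getJson<T>(origin: string, path: string): Promise<T> {
  const response = await fetch(new URL(path, origin).toString());
  if (!response.ok) {
    throw new Error(`GET ${path} failed with status ${response.status}`);
  }
  return (await response.json()) as T;
}
```

Encoding path parameters guards against market IDs or addresses that contain reserved URL characters.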

All-Seeing Eye Integration

Wallet clustering data feeds into the All-Seeing Eye aggregation layer.

flowchart LR
    subgraph WalletClustering["Wallet Clustering"]
        WC[fetchAllWalletClusteringData]
    end
    subgraph ASE["All-Seeing Eye"]
        AGG[aggregateAllData]
        INS[Insight Generator]
        ACT[Action Generator]
    end
    subgraph Outputs["Outputs"]
        DSL[DSL Primitives]
        ALERT[Fraud Alerts]
        UI[Dashboard Widget]
    end
    WC -->|UnifiedDataPoint[]| AGG
    AGG --> INS
    INS --> ACT
    WC --> DSL
    ACT --> ALERT
    INS --> UI

UnifiedDataPoint Fields

Wallet clustering provides data points with these characteristics:

Key Files

File                                                              Purpose
packages/be/src/wallet-clustering/types.ts                        TypeScript interfaces
packages/be/src/wallet-clustering/repository.ts                   Database CRUD operations
packages/be/src/wallet-clustering/service.ts                      Business logic, scan orchestration
packages/be/src/wallet-clustering/dsl-wrappers.ts                 DSL primitive implementations
packages/be/src/integrations/polymarket.ts                        CLOB API integration
packages/be/src/all-seeing-eye/aggregation/wallet-clustering.ts   ASE aggregation bridge
db/migrations/255_wallet_clustering_schema.sql                    Database schema