Extracting Interpretable Features from Neural Embeddings

Autoencoder-based market embedding with linear probe framework — 2,099 resolved Polymarket markets
Linear Probe Accuracy
5-fold CV with permutation test (N=200) — all p < 0.005
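
The permutation test behind these p-values can be sketched as follows. This is a minimal illustration, not the original pipeline. It assumes N=200 is the number of label shuffles and that the p-value uses the common (C + 1) / (N + 1) estimator. The names `permutation_p_value` and `toy_score` are hypothetical, and `toy_score` stands in for the mean 5-fold CV accuracy of a linear probe.

```python
import random


def permutation_p_value(score_fn, features, labels, n_permutations=200, seed=0):
    """Return (observed_score, p_value) for a label-permutation test.

    score_fn(features, labels) -> float is a hypothetical callable; in the
    setting above it would be the mean 5-fold CV accuracy of a linear probe.
    """
    rng = random.Random(seed)
    observed = score_fn(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the feature/label association
        if score_fn(features, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


def toy_score(features, labels):
    # Stand-in for probe accuracy: predict class 1 when the feature is positive.
    hits = sum((x > 0) == bool(y) for x, y in zip(features, labels))
    return hits / len(labels)


# Toy data (an assumption for illustration): the feature sign determines the label.
features = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0] * 5
labels = [0, 0, 0, 0, 1, 1, 1, 1] * 5
observed, p = permutation_p_value(toy_score, features, labels)
```

Under this estimator the smallest attainable p-value with 200 shuffles is 1/201 (about 0.00498), so "p < 0.005" would correspond to no shuffled run matching the observed score.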
Training Convergence
MSE reconstruction loss — AdamW + CosineAnnealingLR
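
For reference, the learning-rate curve that cosine annealing produces can be sketched in closed form. The base rate, `t_max`, and `eta_min` below are assumed values for illustration only; the training hyperparameters are not stated here. `cosine_annealing_lr` is a hypothetical helper, not a PyTorch function.

```python
import math


def cosine_annealing_lr(step, t_max, base_lr, eta_min=0.0):
    """Closed-form cosine schedule: base_lr at step 0, eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / t_max))


# Assumed settings: 100 scheduler steps, base learning rate 1e-3.
schedule = [cosine_annealing_lr(t, t_max=100, base_lr=1e-3) for t in range(101)]
```

The rate starts at `base_lr`, passes through the midpoint halfway, and decays smoothly to `eta_min`.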
Embedding Space (t-SNE projection)
2,099 resolved markets — 32-dim autoencoder latent space projected to 2D
PCA Variance Explained
Share of total variance captured by each principal component
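
For reference, the quantity usually plotted in such a panel is the explained variance ratio. The notation is an assumption: the λ_k are the eigenvalues of the covariance matrix of the data PCA is run on, sorted in decreasing order, and d is its dimensionality (d = 32 if PCA is applied to the autoencoder latents).

```latex
\mathrm{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{d} \lambda_j},
\qquad
\mathrm{CumEVR}_m = \sum_{k=1}^{m} \mathrm{EVR}_k
```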
Leakage Test: cutoff=0.8 vs 1.0
Using only the first 80% of each market's life (cutoff=0.8) excludes the resolution period, which prevents resolution-period leakage
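
One way the cutoff could be applied is sketched below. It assumes "market life" means the interval from market creation to resolution and that each market is a list of (timestamp, price) observations. Both are assumptions, and `truncate_to_cutoff` is a hypothetical helper.

```python
def truncate_to_cutoff(observations, start, end, cutoff=0.8):
    """Keep observations from the first `cutoff` fraction of [start, end]."""
    if not 0.0 < cutoff <= 1.0:
        raise ValueError("cutoff must be in (0, 1]")
    limit = start + cutoff * (end - start)
    return [(t, price) for t, price in observations if t <= limit]


# Toy market (illustrative values): the price converges to 1.0 as resolution nears.
observations = [(t, 0.5 + 0.05 * t) for t in range(11)]  # t = 0..10
early = truncate_to_cutoff(observations, start=0, end=10, cutoff=0.8)
full = truncate_to_cutoff(observations, start=0, end=10, cutoff=1.0)
```

With cutoff=1.0 the final observations, where the price has nearly converged to the outcome, remain in the input. With cutoff=0.8 they are dropped before any features are computed.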