Model Card
chainsight-local-v0
Model card per Mitchell et al. (2019). Educational research only; not designed for live trading and not financial advice.
Heads
| Head | Spec | Implementation | Target |
|---|---|---|---|
| Directional | §6.1 | LightGBM binary, isotonic-OOF calibration | P(label = +1) over triple-barrier; horizons {7, 30} |
| Volatility | §6.4 | Corsi (2009) HAR-RV baseline + LightGBM residual | Forward log-realised vol over horizon |
| Quantile | §6.3 | Three independent LightGBM-quantile regressors (row-wise rearrangement; CQR (Romano 2019) on the acceptance-gate metric only, not yet on served bands) | Forward log-return quantiles {P10, P50, P90} |
| Regime | §6.2 | Gaussian HMM (Viterbi) → LightGBM 4-class classifier | Regime ∈ {markdown, accumulation, markup, distribution} |
| Meta-blender | §6.5/§6.6 | Logistic stacker on per-row OOF base predictions + isotonic | Calibrated blended P(up) |
| Tail risk | §6.5 | Logistic regression on [p_up_dir, regime_probs, vix_z, dxy_z, real_yield_10y_z] | P(>20% drawdown within 30d) |
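
The quantile row mentions row-wise rearrangement and an 80% coverage metric. A minimal sketch of both, assuming the three regressors return a `[P10, P50, P90]` row per day; the function names and sample numbers are hypothetical and not taken from the chainsight code:

```python
import numpy as np

def rearrange_quantiles(q_pred: np.ndarray) -> np.ndarray:
    """Sort each row so that P10 <= P50 <= P90 (row-wise rearrangement)."""
    return np.sort(q_pred, axis=1)

def band_coverage(q_pred: np.ndarray, realised: np.ndarray) -> float:
    """Share of rows whose realised value falls inside [P10, P90]."""
    inside = (realised >= q_pred[:, 0]) & (realised <= q_pred[:, -1])
    return float(inside.mean())

# Hypothetical raw outputs of three independently fitted regressors.
# Columns are [P10, P50, P90]; the second row has crossed quantiles.
raw = np.array([
    [-0.08, 0.01, 0.09],
    [0.02, -0.01, 0.07],
])
fixed = rearrange_quantiles(raw)

# Hypothetical realised forward log-returns for the same two rows.
realised = np.array([0.03, 0.10])
coverage = band_coverage(fixed, realised)
```

Independently fitted quantile regressors can cross because nothing ties them together; sorting each row restores monotonicity without refitting. On real out-of-fold data, `coverage` is the quantity the [0.75, 0.85] gate would be checked against.
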
Spec §7.4 acceptance gates · 4 / 6 green (live) · 1 pending
- Directional log-loss · < 0.69 (raw OOF) · 0.691
- Directional ECE · < 0.05 (after isotonic) · 0.022
- Meta-blender ECE · < 0.05 (after isotonic) · 0.008
- Quantile 80% coverage · [0.75, 0.85] (raw OOF) · 0.81
- Volatility QLIKE · ≥ 0.1 vs HAR-RV (OOF) · 6.5%
- Regime cross-entropy · < 1.1 (OOF) · 0.712
- Tail-risk calibration · |pred − realised| < 0.05 (OOF) · pending
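
The "4 / 6 green (live) · 1 pending" tally can be reproduced from the thresholds and values listed above. This is an illustrative sketch only: the `Gate` class and `status` helper are hypothetical, and it assumes the 6.5% QLIKE figure is on the same scale as the 0.1 threshold (that is, 0.065).

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Gate:
    name: str
    passes: Callable[[float], bool]
    value: Optional[float]  # None means the gate is pending

GATES = [
    Gate("Directional log-loss", lambda v: v < 0.69, 0.691),
    Gate("Directional ECE", lambda v: v < 0.05, 0.022),
    Gate("Meta-blender ECE", lambda v: v < 0.05, 0.008),
    Gate("Quantile 80% coverage", lambda v: 0.75 <= v <= 0.85, 0.81),
    # Assumption: 6.5% is read as 0.065 against the >= 0.1 threshold.
    Gate("Volatility QLIKE", lambda v: v >= 0.1, 0.065),
    Gate("Regime cross-entropy", lambda v: v < 1.1, 0.712),
    # Value would be |pred - realised|; no value is published yet.
    Gate("Tail-risk calibration", lambda v: v < 0.05, None),
]

def status(gate: Gate) -> str:
    """Return 'pending', 'green', or 'red' for a single gate."""
    if gate.value is None:
        return "pending"
    return "green" if gate.passes(gate.value) else "red"

statuses = {g.name: status(g) for g in GATES}
live = [s for s in statuses.values() if s != "pending"]
n_green = live.count("green")
n_live = len(live)
n_pending = len(GATES) - n_live
```

Under that reading, the two red gates are directional log-loss (0.691 is not below 0.69) and volatility QLIKE (0.065 is below 0.1).
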
Model Health
- Directional ECE · 0.022 · target < 0.05. Calibration of the directional head: the gap between predicted probabilities and reality. Lower is better: when it says "70% up", BTC actually rose ~70% of the time.
- OOF Log-Loss · 0.691 · target < 0.69. How well the directional model scores on held-out data. The 0.69 (ln 2) ceiling is the coin-flip baseline; below it means the model beats random.
- Feature Drift · 3 / 15 · |z| > 2 features today. Count of inputs that look unusually extreme vs their history. A handful is normal; a third or more is the cue to retrain.
- Coverage 80% · 0.81 · target [0.75, 0.85]. Share of past days the realised price landed inside the 80% forecast cone. Should sit near 0.80; far off means the cone is mis-sized.
What it means
Four quick self-checks on the model's quality, measured on held-out (never-trained-on) data. For the calibration, log-loss, and coverage gates, the accent color means the check passed and red means it failed. Feature drift is a monitoring signal, not a §7.4 gate: accent = healthy, amber = drift building (watch), red = time to retrain.
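
For readers who want to see what the calibration and log-loss checks measure, here is a minimal sketch of expected calibration error (ECE) and the coin-flip log-loss baseline. The binning scheme (10 equal-width bins) is an assumption; the page does not say how its ECE is binned, and the sample inputs are made up.

```python
import math
import numpy as np

def expected_calibration_error(p_pred, y_true, n_bins=10):
    """Weighted mean |avg predicted prob - observed frequency| over bins."""
    p = np.asarray(p_pred, dtype=float)
    y = np.asarray(y_true, dtype=float)
    # Equal-width bins on [0, 1]; p == 1.0 is folded into the last bin.
    bin_idx = np.minimum((p * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            gap = abs(p[in_bin].mean() - y[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by share of rows in the bin
    return float(ece)

# "70% up" on ten days, seven of which actually rose: well calibrated.
calibrated = expected_calibration_error([0.7] * 10, [1] * 7 + [0] * 3)

# "90% up" on ten days, only five of which rose: overconfident.
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)

# Log-loss of always predicting 0.5 on a balanced binary target.
coin_flip_log_loss = -math.log(0.5)
```

The coin-flip baseline is ln 2 ≈ 0.6931, so a log-loss of 0.691 sits marginally below the baseline while still missing the slightly stricter < 0.69 gate.
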
Why it matters
These tell you whether to trust the headline forecasts above. If calibration or log-loss go red, the probabilities are unreliable; if drift climbs, the inputs have shifted away from what the model learned on. Full breakdown lives on the model-card page.
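
The drift count can be sketched as a per-feature z-score of today's inputs against their history. The history window, the use of the sample standard deviation, and the synthetic data below are all assumptions; only the |z| > 2 flag and the "a third or more" retrain cue come from this page.

```python
import numpy as np

def drift_flags(history: np.ndarray, today: np.ndarray, z_thresh: float = 2.0) -> np.ndarray:
    """Flag features whose value today is more than z_thresh sigmas from its history."""
    mu = history.mean(axis=0)
    sigma = history.std(axis=0, ddof=1)
    z = (today - mu) / sigma
    return np.abs(z) > z_thresh

# Synthetic stand-in for 500 days of 15 standardised features.
rng = np.random.default_rng(0)
history = rng.normal(size=(500, 15))

# Today: three features pushed far from their history, the rest at the mean.
today = np.zeros(15)
today[:3] = 5.0

flags = drift_flags(history, today)
n_drifted = int(flags.sum())

# Retrain cue from the page: a third or more of the features flagged.
retrain_cue = 3 * n_drifted >= flags.size
```

With three of fifteen features flagged the cue does not fire; it would at five or more.
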
Limitations
Off-chain price discovery (CEXes, derivatives) dominates Bitcoin short-term price formation. Expect a hard ceiling around 55–57% directional accuracy on a 7d horizon — this system pursues calibration first and discrimination second. All seven §7.4 gates are computed nightly from purged-k-fold OOF arrays at training time and persisted to `model_health` (2026-06-29). A gate renders “pending” only when its source OOF series is degenerate — most commonly tail-risk calibration on the synthetic fixture, which lacks >=20% drawdowns.
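
As a rough illustration of how purged k-fold splits can produce leakage-free OOF arrays, the sketch below assumes each row's label looks `horizon` rows forward and drops any training row whose label window overlaps the test block's. The function name, the contiguous fold layout, and the symmetric gap are assumptions, not the project's actual splitter.

```python
import numpy as np
from typing import Iterator, Tuple

def purged_kfold(
    n_samples: int, n_splits: int = 5, horizon: int = 7
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (train_idx, test_idx) for contiguous, time-ordered folds.

    Row i is assumed to carry a label built from rows [i, i + horizon].
    Training rows whose window overlaps any test row's window are purged.
    """
    indices = np.arange(n_samples)
    for test_idx in np.array_split(indices, n_splits):
        start, stop = test_idx[0], test_idx[-1]
        # Keep rows whose window ends before the test block starts, or
        # that begin after the last test row's window has closed.
        keep = (indices + horizon < start) | (indices > stop + horizon)
        yield indices[keep], test_idx

folds = list(purged_kfold(100, n_splits=5, horizon=7))
```

Every row appears in exactly one test block, so concatenating per-fold predictions gives one OOF prediction per row, which is the kind of array the gates would then be computed from.
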