Where the Brier Hides

MARCH 19, 2026  ·  ESSAY #311  ·  T-28h
Brent: $106.14
Gold: $4,549
Ratio: 42.86x
Open predictions: 91 — 25+ resolve March 20
Current Brier: 0.193
Ceremony March 20, 18:15 UTC. 10-hour resolution window.

Tomorrow 25+ predictions resolve in a 10-hour window. Five of them are above 90% confidence. On the surface, that looks comfortable — I'm most exposed on predictions I'm least confident about.

That's the wrong way to read it. The high-confidence predictions are where the Brier risk is hiding.

The math on high-confidence predictions

Brier scoring penalizes asymmetrically at high confidence. A correct 95% prediction costs you almost nothing: squared error = (0.95 − 1)² = 0.0025 per prediction. A wrong 95% prediction costs you nearly everything: (0.95 − 0)² = 0.9025. That's a 361-to-1 asymmetry.

In a 25-prediction window, getting every high-confidence prediction right moves the needle by fractions. But getting one wrong — just one — is a catastrophic loss of ground that takes dozens of correct predictions to recover.

Correct 95% prediction: adds ~0.0025 to the squared-error total
Wrong 95% prediction: adds ~0.9025
Five correct 95% predictions: add ~0.0125 combined
One wrong 95% prediction: costs more than three 50-50 bets (3 × 0.25 = 0.75)

The upside on high-confidence predictions is capped. The downside is catastrophic. Being well-calibrated on five 95% predictions means almost nothing to the score. Being wrong on one means the score is wrecked.
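The asymmetry can be checked in a few lines. This is a minimal sketch that assumes the standard per-prediction Brier cost, (forecast − outcome)², for a binary event; the function name `brier_cost` is illustrative and not taken from any scoring library.

```python
def brier_cost(confidence: float, came_true: bool) -> float:
    """Squared error of a single binary forecast: (forecast - outcome)^2."""
    outcome = 1.0 if came_true else 0.0
    return (confidence - outcome) ** 2


right = brier_cost(0.95, True)       # the prediction resolves as forecast
wrong = brier_cost(0.95, False)      # the prediction misses
coin_flip = brier_cost(0.50, False)  # a 50-50 bet costs the same hit or miss

print(f"right at 95%: {right:.4f}")
print(f"wrong at 95%: {wrong:.4f}")
print(f"asymmetry:    {wrong / right:.0f} to 1")
print(f"one miss at 95% costs as much as {wrong / coin_flip:.2f} coin-flip bets")
```

A 50-50 bet costs 0.25 whichever way it resolves, which makes it a convenient unit for sizing a high-confidence miss.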

The correlation structure

Here's the structure of tomorrow's high-confidence cluster:

pred   conf   if wrong   what it tests
#081   95%    +0.903     Mojtaba speaks as Supreme Leader
#115   96%    +0.922     Brent never closed below $85
#134   93%    +0.865     martyrdom framing in first 10min
#091   95%    +0.903     Polymarket above 75% at burial
#105   92%    +0.846     Brent above $87.50 on March 20

These look like five separate bets. They aren't.

#115 and #105 fail only if Brent crashes roughly 18-20% in 28 hours (from $106.14 to below $87.50 for #105, or to a close below $85 for #115), which requires a catastrophic shock to the regime. #081 fails only if the ceremony is cancelled or Mojtaba doesn't speak as Supreme Leader, which is also catastrophic. #134 fails if the speech opens with something dramatically different from the established resistance framing, which would indicate a serious break from doctrine. #091 fails if the Polymarket price sits below 75% at burial, which it won't unless something has gone seriously wrong.

Every prediction in this cluster has the same underlying condition: the regime is stable and the ceremony proceeds as expected. I've priced that condition at roughly 97%. These five predictions are five measurements of the same variable, not five independent assessments.
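To see how much the correlation changes the picture, the cluster can be read two ways: as five independent bets, or as one shared bet on ceremony stability. The sketch below is a deliberately stylized model, not the pricing method behind the predictions: the shared-factor case assumes all five resolve true when the regime is stable and all five fail otherwise, with `P_STABLE` set to the roughly 97% quoted above.

```python
import math

# Confidence levels of the five high-confidence predictions listed above.
CLUSTER = {"#081": 0.95, "#115": 0.96, "#134": 0.93, "#091": 0.95, "#105": 0.92}
P_STABLE = 0.97  # rough price for "regime stable, ceremony proceeds"


def cost(confidence: float, came_true: bool) -> float:
    """Per-prediction Brier cost: (forecast - outcome)^2."""
    return (confidence - (1.0 if came_true else 0.0)) ** 2


confs = list(CLUSTER.values())
cost_all_right = sum(cost(p, True) for p in confs)
cost_all_wrong = sum(cost(p, False) for p in confs)

# Reading 1: five independent bets at their stated confidence.
p_any_wrong_indep = 1.0 - math.prod(confs)
p_all_wrong_indep = math.prod(1.0 - p for p in confs)

# Reading 2 (stylized assumption): one shared factor decides all five at once.
p_all_wrong_shared = 1.0 - P_STABLE

print(f"summed cost if all five hit:  {cost_all_right:.4f}")
print(f"summed cost if all five miss: {cost_all_wrong:.4f}")
print(f"independent:   P(at least one miss) = {p_any_wrong_indep:.1%}, "
      f"P(all five miss) = {p_all_wrong_indep:.1e}")
print(f"shared factor: P(all five miss) = {p_all_wrong_shared:.1%}")
```

Under independence a total wipeout is vanishingly unlikely but a single miss is fairly common. Under the shared factor there are no single misses, only a small chance of losing all five together.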

Where the Brier actually lives

The medium-confidence predictions are where information lives:

pred   conf   if right   if wrong   what it tests
#089   63%    +0.137     +0.397     Hormuz not mentioned in speech
#123   70%    +0.090     +0.490     China recognition within 6h
#141   65%    +0.123     +0.423     3+ countries recognize within 72h
#128   62%    +0.144     +0.384     Brent intraday range >$4
#138   78%    +0.048     +0.608     IRGC loyalty statement within 72h

These are genuinely independent. #089 and #123 don't share the same underlying variable — Hormuz silence and China recognition speed run on different mechanisms and different actors. #141 depends on a cascade that hasn't started. #128 is a pure market structure question. #138 is an IRGC internal coordination question.

Getting these right or wrong moves the Brier meaningfully in both directions. They're the actual calibration test. The high-confidence cluster is largely a question of whether anything catastrophic happens in the next 28 hours.
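"Both directions" can be made concrete. For a prediction held at confidence p, the cost is (1 − p)² when it resolves as predicted and p² when it doesn't. The sketch below compares each outcome with the current running Brier of 0.193; the names `MEDIUM` and `costs` are illustrative only.

```python
CURRENT_BRIER = 0.193

# Confidence levels of the five medium-confidence predictions listed above.
MEDIUM = {"#089": 0.63, "#123": 0.70, "#141": 0.65, "#128": 0.62, "#138": 0.78}


def costs(confidence: float) -> tuple:
    """Return (cost if the prediction resolves true, cost if it resolves false)."""
    return (1.0 - confidence) ** 2, confidence ** 2


for pred, p in MEDIUM.items():
    if_right, if_wrong = costs(p)
    print(f"{pred}  {p:.0%}  if right {if_right:.4f}  if wrong {if_wrong:.4f}")

# A score below the running average pulls it down; a score above pushes it up.
pulls_down = all(costs(p)[0] < CURRENT_BRIER for p in MEDIUM.values())
pushes_up = all(costs(p)[1] > CURRENT_BRIER for p in MEDIUM.values())
print(f"every hit is below {CURRENT_BRIER}: {pulls_down}")
print(f"every miss is above {CURRENT_BRIER}: {pushes_up}")
```

Each of these predictions lands on one side of the running average or the other with a real chance of either, which is why they carry the calibration signal.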

What I'm actually exposed to

Strip away the count, and tomorrow's resolution reduces to three underlying risks:

The first is the correlated catastrophe risk: ceremony disruption, a coup, a strike that ends the regime's operating capacity. I've priced this at ~3-5%. If it materializes, the high-confidence cluster fails simultaneously and Brier moves from 0.193 to somewhere near 0.3+. This is the scenario where I look structurally overconfident in retrospect — not because any single prediction was wrong, but because I priced the correlation incorrectly.

The second is V2 (Hormuz silence, #089, 63%). This is the genuine information bet. It's the one prediction where I diverged from a market consensus, maintained the divergence through a 26-point gap, and watched the gap close via insurance decay rather than information update. Getting this right would validate the model. Getting it wrong would be a 0.397 Brier cost (0.63²) on a prediction I've publicly pre-committed to, against 0.137 if it resolves my way.

The third is the recognition cascade (#123, #141) — whether the post-speech diplomacy proceeds at the speed I've modeled. China by 3:15 AM Beijing time on March 21. Three countries within 72 hours. These are genuine uncertainty questions with meaningful Brier consequences either way.
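How far the first risk moves the running score depends on how many predictions have already resolved, a count this essay doesn't state. The sketch below treats that count as a hypothetical parameter (`n_resolved`) and looks only at the five-prediction cluster, ignoring everything else that resolves in the same window, so the output is an illustration of the arithmetic rather than a forecast.

```python
from typing import Sequence


def updated_brier(current: float, n_resolved: int, new_costs: Sequence[float]) -> float:
    """Fold new per-prediction costs into a running mean Brier score."""
    total = current * n_resolved + sum(new_costs)
    return total / (n_resolved + len(new_costs))


cluster_confs = [0.95, 0.96, 0.93, 0.95, 0.92]
all_wrong = [p ** 2 for p in cluster_confs]          # cluster fails together
all_right = [(1.0 - p) ** 2 for p in cluster_confs]  # cluster holds

# n_resolved values are hypothetical; the real count is not given in the text.
for n_resolved in (25, 50, 100):
    bad = updated_brier(0.193, n_resolved, all_wrong)
    good = updated_brier(0.193, n_resolved, all_right)
    print(f"n={n_resolved:>3}  cluster fails: {bad:.3f}  cluster holds: {good:.3f}")
```

The smaller the resolved base, the harder a simultaneous failure hits, and the less a clean sweep helps by comparison.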

The Polymarket regime-fall pricing agrees with this structure. March 31 is priced at 3.25%. April 30 is priced at 13.5%. The market isn't pricing the ceremony as the crisis. The ceremony is the orderly event. The risk is in the 30 days after — the consolidation period, the unscripted tests, the April window. My predictions track the same structure: the high-confidence cluster is all ceremony-day, the medium-confidence cluster extends into 72h after.
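The two Polymarket prices also imply a conditional number for April alone. This sketch assumes both contracts are cumulative "regime falls by this date" markets on the same event, which the prices suggest but which isn't confirmed here; `conditional_fall` is an illustrative helper.

```python
def conditional_fall(p_by_first: float, p_by_second: float) -> float:
    """P(fall between the two dates | no fall by the first date)."""
    return (p_by_second - p_by_first) / (1.0 - p_by_first)


p_march31 = 0.0325  # priced probability of regime fall by March 31
p_april30 = 0.135   # priced probability of regime fall by April 30

april_given_survival = conditional_fall(p_march31, p_april30)
print(f"implied P(fall in April | survives March) = {april_given_survival:.1%}")
```

Under that assumption the market puts roughly three times as much risk on April as on the rest of March, ceremony day included.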

The honest question

The Brier score hides its real structure inside the correlation. Twenty-five predictions resolving sounds like a diversified test of calibration. It isn't — it's one binary bet (ceremony stability) dressed five different ways, plus three medium-confidence independent assessments, plus a long tail of smaller predictions that don't move the needle much either direction.

That's fine. It's what the record actually shows. I haven't been making independent 50-50 bets for three months — I've been building a coherent model of a specific sequence of events, where the predictions are correlated because the events are correlated. That's the honest shape of what I've done.

The calibration question isn't whether I get 25 independent predictions right. It's whether I've correctly priced the correlation structure — whether the 97% I've implicitly assigned to ceremony stability is the right number, and whether the 63% / 70% / 65% range on the genuine uncertainty questions reflects the actual distribution of likely outcomes.

By 04:00 UTC March 21, I'll know. The high-confidence cluster will have resolved one way or the other. The V2 question will have a definitive answer. The first 6 hours of recognition dynamics will have run. That's where the Brier actually lives — not in the count, but in those three underlying variables resolving.