Tournament · the 1.1M-run engine

Monte-Carlo simulator — playing the bracket a million times

Wraps the frozen ensemble match model and plays the real 48-team tournament 1.1 million times — sampling scorelines, applying the full FIFA tiebreakers and the eight-best-thirds rule, then knockouts with calibrated extra time and shootouts — with a Monte-Carlo standard error on every probability.

OOS RPS · expanding: n/a (no match-level RPS; the simulator wraps the ensemble)
OOS RPS · LOTO: not applicable
Tournament runs: 1.1M of the official 48-team bracket

§ 01

The intuition

In plain English, before any mathematics.

The simulator wraps the frozen ensemble match model and plays the real 48-team tournament 1.1M times. Each run samples scorelines for all 104 fixtures, applies the full FIFA tiebreakers and the eight-best-thirds rule, and runs the knockouts with calibrated extra time and shootouts; the tallies record how often each team reaches each stage. Every probability carries a Monte-Carlo standard error so simulation noise is never mistaken for skill.

§ 02

Mathematical specification

One simulated tournament, formally — then repeated until every probability carries a known Monte-Carlo error. The simulator adds no skill of its own: it propagates the frozen match model through the real bracket.

P(\text{champion}_{i}) = \frac{1}{N} \sum_{s=1}^{N} \mathbf{1}\{\text{team } i \text{ wins sim } s\}, \qquad N = 1{,}100{,}000

The champion probability as a Monte-Carlo frequency over N runs
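As a minimal sketch of the frequency above, assuming each simulated tournament has already been reduced to the name of its champion: the function name `champion_frequencies` and the eight-run toy input are hypothetical and are not taken from the model's code.

```python
from collections import Counter


def champion_frequencies(champions):
    """Monte-Carlo estimate: the share of simulations each team won."""
    n = len(champions)
    if n == 0:
        raise ValueError("need at least one simulated tournament")
    counts = Counter(champions)
    return {team: wins / n for team, wins in counts.items()}


# Toy input (made up): the winner of each of N = 8 simulated tournaments.
toy_champions = ["ARG", "FRA", "ARG", "BRA", "FRA", "ARG", "ESP", "ARG"]
probs = champion_frequencies(toy_champions)
```

Teams that never win a run are simply absent from the result, which corresponds to an estimated probability of zero.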

01 Sampling a match

For each of the 104 fixtures a full scoreline is drawn from the ensemble grid, not just a result, because goal difference and goals scored decide group tiebreaks. Every grid is renormalized to sum to one before sampling, so simulation noise can never leak in as fake probability.

(x, y)_{f} \sim P_{ens}^{(f)}(\cdot, \cdot), \qquad \sum_{x, y} P_{ens}^{(f)}(x, y) = 1 \text{ for every fixture } f

Scorelines drawn from the renormalized per-fixture grids
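The sampling step can be sketched as follows, assuming a grid is stored as a dictionary from `(home_goals, away_goals)` to probability. The data layout, the helper names and the toy grid are illustrative assumptions; only the renormalize-then-sample order and the sum-to-one check come from the description.

```python
import random


def renormalize(grid):
    """Rescale a scoreline grid so its probabilities sum to one."""
    total = sum(grid.values())
    if total <= 0:
        raise ValueError("grid has no probability mass")
    return {score: p / total for score, p in grid.items()}


def sample_scoreline(grid, rng):
    """Draw one (home_goals, away_goals) pair from the renormalized grid."""
    norm = renormalize(grid)
    # Integrity check: the renormalized grid must sum to one.
    assert abs(sum(norm.values()) - 1.0) < 1e-9
    scores = list(norm)
    weights = [norm[s] for s in scores]
    return rng.choices(scores, weights=weights, k=1)[0]


# Toy grid (made up) that deliberately sums to 0.98 before renormalization.
toy_grid = {(0, 0): 0.10, (1, 0): 0.25, (1, 1): 0.20, (0, 1): 0.15, (2, 1): 0.28}
rng = random.Random(2026)  # seeded so the draw is reproducible
scoreline = sample_scoreline(toy_grid, rng)
```

Passing an explicit `random.Random` instance keeps every run reproducible from its seed, which is one plausible way to make a locked forecast repeatable.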

02 Groups and the best thirds

The twelve group tables are built with the full FIFA tiebreak chain, applied in order until ties resolve. The top two of each group advance, and the eight best third-placed teams are slotted into the round of 32 by FIFA’s Annex C bracket table — the highest-risk piece of bracket logic, verified against the official mapping with zero disagreements across all 495 possible third-place combinations.

rank = (Pts, GD, GF, head-to-head, fair play, lots)

The tiebreak order — within groups and across the twelve thirds
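The first three links of the tiebreak chain map naturally onto a tuple sort. The sketch below is deliberately partial: it stops at goals scored and leaves out head-to-head, fair play and the drawing of lots, and it does not reproduce the Annex C slotting table. The row layout and function names are hypothetical.

```python
from math import comb


def sort_key(row):
    """Order by points, then goal difference, then goals scored (all descending)."""
    return (-row["pts"], -row["gd"], -row["gf"])


def rank_rows(rows):
    # sorted() is stable, so rows still level after goals scored keep their
    # input order; a full implementation would continue down the chain.
    return sorted(rows, key=sort_key)


def best_thirds(groups, k=8):
    """Pick the k best third-placed teams across all groups."""
    thirds = [rank_rows(rows)[2] for rows in groups.values()]
    return rank_rows(thirds)[:k]


# Toy group (made up): B and C are level on points and goal difference.
toy_group = [
    {"team": "A", "pts": 7, "gd": 4, "gf": 6},
    {"team": "B", "pts": 4, "gd": 0, "gf": 3},
    {"team": "C", "pts": 4, "gd": 0, "gf": 5},
    {"team": "D", "pts": 1, "gd": -4, "gf": 1},
]
ranking = [row["team"] for row in rank_rows(toy_group)]  # A, C, B, D

# Ways to choose which 8 of the 12 groups supply a qualifying third.
n_third_place_combinations = comb(12, 8)  # 495
```

The count of 495 is simply the binomial coefficient C(12, 8), which is why an exhaustive test of the slotting table is feasible.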

03 Extra time and shootouts

A drawn knockout match goes to an independent extra-time grid at one third of the regulation goal rate — thirty minutes of the ninety. A surviving tie goes to a shootout modelled as near 50/50 with a small Elo tilt, fitted on 678 real shootouts and capped so that even a 400-point rating gap moves the shootout only to about 60–40: penalty kicks are the irreducible luck floor of knockout football.

λ^{ET} = \frac{30}{90} λ, P (i wins shootout) = \frac{1}{1 + e ^{- β_{pen} (R_{i} - R_{j}) /400}}

Calibrated extra time, and a shootout that stays close to a coin flip
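A worked sketch of the two formulas follows. The fitted value of the shootout tilt is not given here, so the constant below is an illustrative assumption chosen only to reproduce the quoted 60–40 split at a 400-point gap; it is not the model's parameter, and the capping mechanism itself is not modelled.

```python
import math


def extra_time_rate(regulation_rate):
    """Goal rate for 30 minutes of extra time: one third of the 90-minute rate."""
    return regulation_rate * 30.0 / 90.0


def shootout_win_prob(rating_i, rating_j, beta_pen):
    """Logistic shootout probability with a small rating tilt."""
    return 1.0 / (1.0 + math.exp(-beta_pen * (rating_i - rating_j) / 400.0))


# Illustrative value only (an assumption, not the fitted parameter):
# beta = ln(1.5) gives exactly 0.60 at a 400-point rating gap.
BETA_PEN_ILLUSTRATIVE = math.log(1.5)

p_even = shootout_win_prob(1800, 1800, BETA_PEN_ILLUSTRATIVE)     # 0.50
p_favored = shootout_win_prob(1900, 1500, BETA_PEN_ILLUSTRATIVE)  # 0.60
et_rate = extra_time_rate(1.5)                                    # 0.5 goals
```

With this tilt a 400-point favorite still loses four shootouts in ten, which is the sense in which penalties stay close to a coin flip.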

04 The estimate and its error

Every probability on this site is a Monte-Carlo frequency with a binomial standard error. At the locked 100,000-run forecast the worst case, reached at a probability of 0.5, is 0.16 percentage points: small, quantified, and printed beside the figures it belongs to.

P_{i} = \frac{1}{N} \sum_{s=1}^{N} \mathbf{1}\{\text{team } i \text{ wins sim } s\}, \qquad SE(P_{i}) = \sqrt{\frac{P_{i}(1 - P_{i})}{N}}

The Monte-Carlo estimate and its standard error
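The quoted worst case can be checked directly from the binomial standard error, the square root of P(1 − P)/N, which is largest at P = 0.5. The function name is hypothetical; the run count is the locked 100,000 from the text.

```python
import math


def mc_standard_error(p, n):
    """Binomial standard error of a Monte-Carlo frequency p over n runs."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# Worst case at the locked forecast: p = 0.5, N = 100,000.
worst_case = mc_standard_error(0.5, 100_000)
worst_case_pp = 100.0 * worst_case  # about 0.16 percentage points
```

Because the error shrinks with the square root of N, quadrupling the number of runs only halves it.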

05 Conservation checks

Two identities hold by construction and are asserted on every run: exactly one champion per simulation, exactly 32 teams in the round of 32. If either ever failed, the build would fail with it.

\sum_{i=1}^{48} P(\text{champion}_{i}) = 1, \qquad \sum_{i=1}^{48} P(\text{R32}_{i}) = 32

Probability is conserved across the 48 teams
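A sketch of how the two identities might be asserted, both per run and on the aggregated probabilities. The function names, the tolerance and the uniform toy inputs are assumptions for illustration; the actual build checks are not shown in this page.

```python
import math


def check_run(champions, round_of_32):
    """Per-simulation check: one champion, 32 distinct teams in the round of 32."""
    if len(champions) != 1:
        raise AssertionError(f"expected 1 champion, got {len(champions)}")
    if len(set(round_of_32)) != 32:
        raise AssertionError("round of 32 must hold 32 distinct teams")


def check_conservation(champion_prob, r32_prob, tol=1e-9):
    """Aggregate check: probabilities sum to 1 and to 32 across all teams."""
    champ_total = math.fsum(champion_prob.values())
    r32_total = math.fsum(r32_prob.values())
    if abs(champ_total - 1.0) > tol:
        raise AssertionError(f"champion probabilities sum to {champ_total}")
    if abs(r32_total - 32.0) > tol:
        raise AssertionError(f"round-of-32 probabilities sum to {r32_total}")
    return champ_total, r32_total


# Toy input (made up): 48 equally strong teams.
teams = [f"T{i:02d}" for i in range(48)]
uniform_champion = {t: 1.0 / 48 for t in teams}
uniform_r32 = {t: 32.0 / 48 for t in teams}
totals = check_conservation(uniform_champion, uniform_r32)
check_run(["T00"], teams[:32])
```

Raising on failure rather than logging is what lets a broken identity fail the build along with it.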

Symbol key

$P_{ens}^{(f)}$: the frozen ensemble scoreline grid for fixture f
$N$: the number of simulated tournaments (100,000 locked + 1.0M re-draws)
$s$: a single simulated tournament (one full bracket play-through)
$\mathbf{1}\{\cdot\}$: the indicator, equal to 1 if team i won simulation s and 0 otherwise
$λ^{ET}$: the extra-time goal rate — one third of the regulation rate (30 of 90 minutes)
$β_{pen}$: the shootout Elo tilt — fitted on 678 real shootouts and capped small
$R_{i}$: team i’s Elo rating, so $R_{i} - R_{j}$ is the rating gap between the two sides
$P_{i}$: team i’s estimated probability — the Monte-Carlo frequency
$SE$: the Monte-Carlo standard error reported beside every probability

§ 03

What data it uses

The inputs this model reads — and only these.

Frozen ensemble scoreline grids for all 104 fixtures
Official 48-team structure + the FIFA Annex C best-thirds table
678 real penalty shootouts for the shootout calibration

§ 04

How it works

A schematic of the model wired end to end.

Fig. M·Monte Conceptual schematic

Monte-Carlo simulator — wired end to end

Source · Oxford Football Forecasting model · structural diagram, not a data plot

§ 05

Out-of-sample skill

Where this model lands between the Elo floor and the market ceiling, on both backtest protocols.

Fig. V11 Lower is better · floor = Elo-only · ceiling = de-vigged market

OOS RPS — expanding (headline) and LOTO (optimistic)

The simulator has no match-level RPS — it wraps the frozen ensemble match model and is scored on tournament stage probabilities instead.

The simulator is validated on the tournament outcome space — the ordinal stage reached and the champion — not on match W/D/L. Its match-level skill is the ensemble’s (0.1891 expanding RPS). See the validation page for stage-level scoring and the convergence checks.

Its skill is inherited: the simulator can only be as good as the ensemble it plays out, plus irreducible knockout variance.

Source · Oxford Football Forecasting model · Bookmaker consensus (de-vigged closing odds) · 152 matches · 3 tournaments

§ 06

The 1.1M-run engine

Sampling the bracket, the verified tiebreakers, and the integrity checks.

The simulator freezes the ensemble match model and plays the real bracket. For each of the 104 fixtures it renormalizes the scoreline grid and samples a scoreline; it builds the twelve group tables with the full FIFA tiebreaker chain (points → goal difference → goals scored → head-to-head → fair play → drawing of lots); it ranks the third-placed teams, selects the eight best, and slots them into the round of 32 by FIFA’s Annex C table; then it runs the knockouts, with extra time at 30/90 of the regulation goal rate and shootouts calibrated to real base rates, all the way to a champion. Repeating this 1,100,000 times turns the tallies into a probability for every team reaching every stage.
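The final tallying step can be sketched for a single team, assuming each simulation records only the furthest stage that team reached. The stage labels and the function name are hypothetical; the point is that reaching a stage implies reaching every earlier one, so the per-stage probabilities are cumulative.

```python
STAGES = ["group", "R32", "R16", "QF", "SF", "final", "champion"]


def stage_probabilities(furthest_stages):
    """P(reach stage) for one team, from its furthest stage in each simulation."""
    n = len(furthest_stages)
    if n == 0:
        raise ValueError("need at least one simulation")
    reached = [0] * len(STAGES)
    for stage in furthest_stages:
        # Reaching a stage implies having reached every earlier one.
        for k in range(STAGES.index(stage) + 1):
            reached[k] += 1
    return {name: reached[k] / n for k, name in enumerate(STAGES)}


# Toy input (made up): one team's furthest stage in four simulated tournaments.
toy_runs = ["R32", "group", "champion", "QF"]
probs = stage_probabilities(toy_runs)
```

The resulting probabilities never increase from one stage to the next, which is a cheap sanity check on any stage table.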

Tournaments simulated: 1.1M (100k locked forecast + 1.0M power re-draws)
Best-thirds disagreements: 0 / 495 (our Annex C parse vs the reference table, unit-tested first)
Shootouts calibrated: 678 (real penalty shootouts set the base rate)
Fixtures per simulation: 104 (the official 48-team schedule, renormalized grids)

Fig. M·Sim Conceptual schematic · one run, repeated 1.1M times

The pipeline — grid to champion

Source · Oxford Football Forecasting model — 1.1M tournament simulations · structural diagram, not a data plot

Integrity & the irreducible knockout floor

Every grid is renormalized before sampling (the build asserts the probabilities sum to one to within 1e-9), so simulation noise never leaks in as fake skill. The conservation checks hold by construction: the champion probabilities sum to one and the round-of-32 probabilities sum to 32. Stability was re-checked by comparing the 50k-run and 100k-run probabilities, which agree to within the Monte-Carlo standard error reported beside every figure. The simulator inherits the match model’s ceiling and adds the irreducible variance of knockout football: a one-off shootout caps how accurate any pre-tournament forecast can be.
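One way such a stability check could look is sketched below. The comparison rule (difference no larger than z standard errors at the smaller run count, with z = 2) is an assumption made for illustration, as are the function name and the toy probabilities; the page does not specify the exact criterion used.

```python
import math


def is_stable(p_small, p_large, n_small, z=2.0):
    """True if two estimates differ by at most z standard errors.

    The standard error is taken at the smaller run count, using the
    larger run's estimate as the reference probability.
    """
    se = math.sqrt(p_large * (1.0 - p_large) / n_small)
    return abs(p_small - p_large) <= z * se


# Toy figures (made up): one team's title probability at 50k and 100k runs.
stable = is_stable(p_small=0.1520, p_large=0.1508, n_small=50_000)
unstable = is_stable(p_small=0.1600, p_large=0.1508, n_small=50_000)
```

In this toy case the standard error is about 0.0016, so a gap of 0.0012 passes while a gap of 0.0092 does not.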

§ 07

Strengths & limits

What this model is good for — and where it is weak.

Strengths

Real bracket wiring + full FIFA tiebreakers (0/495 thirds errors)
MC-SE on every probability
Renormalizes every grid before sampling (integrity-checked)

Limits

Inherits the match model's skill ceiling
Irreducible knockout variance caps achievable accuracy
Pre-tournament lock; no refit after kickoff