WC 2026 · Oxford Football Forecasting

Tournament · the 1.1M-run engine

Monte-Carlo simulator — playing the bracket a million times

Wraps the frozen ensemble match model and plays the real 48-team tournament 1.1 million times — sampling scorelines, applying the full FIFA tiebreakers and the eight-best-thirds rule, then knockouts with calibrated extra time and shootouts — with a Monte-Carlo standard error on every probability.

OOS RPS · expanding

no match-RPS — wraps the ensemble

OOS RPS · LOTO

not applicable

1.1M

Tournament runs

of the official 48-team bracket

Wraps the frozen ensemble match model and plays the real 48-team tournament 1.1M times: sample scorelines for all 104 fixtures, apply the full FIFA tiebreakers and the 8-best-thirds rule, run knockouts with calibrated extra-time and shootouts, and tally how often each team reaches each stage. Every probability carries a Monte-Carlo standard error so simulation noise is never mistaken for skill.

The champion probability as a Monte-Carlo frequency over N runs
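
One way to write this frequency, in notation that is an assumption here rather than the site's own symbols: N is the number of simulated tournaments, s indexes a single simulation, and the indicator equals 1 when team i wins simulation s.

```latex
\hat{p}_i \;=\; \frac{1}{N} \sum_{s=1}^{N} \mathbf{1}\{\text{team } i \text{ wins simulation } s\}
```

Because exactly one team wins each simulation, these frequencies sum to one across the 48 teams.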

01 Sampling a match

For each of the 104 fixtures a full scoreline is drawn from the ensemble grid, not just a win/draw/loss result, because goal difference and goals scored decide group tiebreaks. Every grid is renormalised to sum to one before sampling, so stray probability mass in a grid can never distort the draws.
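
A minimal sketch of this step, assuming a grid is stored as a dictionary from (home goals, away goals) to probability. The function names and the toy numbers are hypothetical and are not taken from the site's code.

```python
import random

def renormalise(grid):
    """Return a copy of the grid whose probabilities sum to one."""
    total = sum(grid.values())
    if total <= 0:
        raise ValueError("grid has no probability mass")
    return {score: p / total for score, p in grid.items()}

def sample_scoreline(grid, rng):
    """Draw one (home_goals, away_goals) pair from a renormalised grid."""
    scores = list(grid)
    weights = [grid[score] for score in scores]
    return rng.choices(scores, weights=weights, k=1)[0]

# Toy grid with made-up numbers that deliberately sum to 0.90, not 1.
toy_grid = {(0, 0): 0.10, (1, 0): 0.25, (1, 1): 0.20, (0, 1): 0.15, (2, 1): 0.20}
grid = renormalise(toy_grid)

# Same tolerance the page quotes for its build-time check.
assert abs(sum(grid.values()) - 1.0) < 1e-9

rng = random.Random(2026)  # fixed seed so the draw is reproducible
home_goals, away_goals = sample_scoreline(grid, rng)
```

Sampling the full pair rather than a win/draw/loss label keeps goal difference and goals scored available for the group tables.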

Scorelines drawn from the renormalised per-fixture grids

02 Groups and the best thirds

The twelve group tables are built with the full FIFA tiebreak chain, applied in order until ties resolve. The top two of each group advance, and the eight best third-placed teams are slotted into the round of 32 by FIFA’s Annex C bracket table — the highest-risk piece of bracket logic, verified against the official mapping with zero disagreements across all 495 possible third-place combinations.
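
The figure of 495 is the number of ways to choose which eight of the twelve groups supply a qualifying third-placed team. The sketch below enumerates those combinations and shows how a ranked set of thirds becomes a lookup key. The ranking uses only points, goal difference and goals scored, which is a simplification, and the toy records are invented for illustration.

```python
from itertools import combinations
from math import comb

GROUPS = "ABCDEFGHIJKL"  # twelve groups

# Every way of choosing which eight groups supply a qualifying third.
third_place_combos = list(combinations(GROUPS, 8))
assert len(third_place_combos) == comb(12, 8) == 495

def best_eight_thirds(thirds):
    """thirds maps group letter -> (points, goal_diff, goals_for).

    Returns the sorted group letters of the eight best thirds, which is
    the kind of key a bracket lookup table could be indexed by.
    Later tiebreak steps are ignored in this simplified ranking.
    """
    ranked = sorted(thirds, key=lambda g: thirds[g], reverse=True)
    return tuple(sorted(ranked[:8]))

# Hypothetical third-placed records, one per group.
toy_thirds = {
    "A": (0, -5, 0), "B": (4, 0, 2), "C": (3, 0, 4), "D": (3, 0, 2),
    "E": (3, -1, 3), "F": (3, -1, 2), "G": (2, -1, 1), "H": (2, -2, 2),
    "I": (1, -2, 1), "J": (1, -3, 1), "K": (1, -4, 0), "L": (4, 2, 5),
}
key = best_eight_thirds(toy_thirds)
assert key in set(third_place_combos)
```

A coverage test over all 495 keys is cheap, which is one reason a table like this can be checked exhaustively rather than by spot checks.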

The tiebreak order — within groups and across the twelve thirds

03 Extra time and shootouts

A drawn knockout match goes to an independent extra-time grid at one third of the regulation goal rate — thirty minutes of the ninety. A surviving tie goes to a shootout modelled as near 50/50 with a small Elo tilt, fitted on 678 real shootouts and capped so that even a 400-point rating gap moves the shootout only to about 60–40: penalty kicks are the irreducible luck floor of knockout football.
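
A rough sketch of the two pieces described above. The one-third scaling and the 400-point, roughly 60–40 bound come from the text; the linear tilt, the slope value, and the use of independent Poisson goals for extra time are assumptions made for illustration and may differ from the fitted model.

```python
import math
import random

def extra_time_rate(regulation_rate):
    """Scale a 90-minute goal rate to 30 minutes of extra time."""
    return regulation_rate * 30 / 90

def sample_poisson(rate, rng):
    """Knuth's method; adequate for the small rates seen in football."""
    threshold = math.exp(-rate)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals

def shootout_win_prob(elo_diff, slope=0.10 / 400, cap=0.10):
    """Near coin flip with a small, capped Elo tilt.

    slope and cap are hypothetical values chosen so that a 400-point
    gap gives 0.60 and no gap, however large, gives more than that.
    """
    tilt = max(-cap, min(cap, slope * elo_diff))
    return 0.5 + tilt

rng = random.Random(7)
et_home = sample_poisson(extra_time_rate(1.5), rng)  # 1.5 goals per 90 -> 0.5 per 30
et_away = sample_poisson(extra_time_rate(1.2), rng)
if et_home == et_away:
    home_wins = rng.random() < shootout_win_prob(elo_diff=150)
else:
    home_wins = et_home > et_away
```

Capping the tilt keeps the shootout close to a coin flip even for very lopsided pairings, which matches the "luck floor" framing in the text.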

Calibrated extra time, and a shootout that stays close to a coin flip

04 The estimate and its error

Every probability on this site is a Monte-Carlo frequency with a binomial standard error. At the locked 100,000-run forecast the worst-case standard error, reached at a probability of 50%, is 0.16 percentage points: small, quantified, and printed beside the figures it belongs to.
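
The 0.16-point figure can be reproduced from the usual binomial formula, sqrt(p(1 − p) / N), which is largest at p = 0.5. The helper name below is hypothetical.

```python
import math

def mc_standard_error(p_hat, n_runs):
    """Binomial standard error of a Monte-Carlo frequency."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n_runs)

# Worst case: p = 0.5 at the locked 100,000-run forecast.
worst_case = mc_standard_error(0.5, 100_000)
worst_case_pp = 100 * worst_case  # in percentage points, about 0.158

# A 10% probability is estimated more tightly than a 50% one.
ten_percent_pp = 100 * mc_standard_error(0.10, 100_000)
```

Since the error shrinks with the square root of N, going from 100,000 to 1.1M runs cuts it by a factor of about 3.3, not 11.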

The Monte-Carlo estimate and its standard error

05 Conservation checks

Two identities hold by construction and are asserted on every run: exactly one champion per simulation, exactly 32 teams in the round of 32. If either ever failed, the build would fail with it.
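
A sketch of how such checks could be asserted while tallying, assuming each simulation is recorded as a dictionary with a champion and a round-of-32 list. The record layout and function name are hypothetical.

```python
from collections import Counter

def tally(simulations):
    """Count champions and round-of-32 appearances, asserting conservation."""
    champion_counts = Counter()
    r32_counts = Counter()
    for sim in simulations:
        # Exactly 32 distinct teams must reach the round of 32.
        assert len(sim["round_of_32"]) == 32
        assert len(set(sim["round_of_32"])) == 32
        # The single champion must be one of them.
        assert sim["champion"] in sim["round_of_32"]
        champion_counts[sim["champion"]] += 1
        r32_counts.update(sim["round_of_32"])
    n = len(simulations)
    # Summed over teams: one champion and 32 qualifiers per simulation.
    assert sum(champion_counts.values()) == n
    assert sum(r32_counts.values()) == 32 * n
    return champion_counts, r32_counts

# Two toy simulations over 48 placeholder team names.
teams = [f"T{i:02d}" for i in range(48)]
sims = [
    {"round_of_32": teams[:32], "champion": "T00"},
    {"round_of_32": teams[16:], "champion": "T40"},
]
champion_counts, r32_counts = tally(sims)
```

Dividing the counts by the number of simulations gives champion probabilities that sum to one and round-of-32 probabilities that sum to 32.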

Probability is conserved across the 48 teams

Symbol key

the frozen ensemble scoreline grid for fixture f
the number of simulated tournaments (100,000 locked + 1.0M re-draws)
a single simulated tournament (one full bracket play-through)
the indicator — 1 if team i won simulation s, else 0
the extra-time goal rate — one third of the regulation rate (30 of 90 minutes)
the shootout Elo tilt — fitted on 678 real shootouts and capped small
team i’s estimated probability — the Monte-Carlo frequency
the Monte-Carlo standard error reported beside every probability
  • Frozen ensemble scoreline grids for all 104 fixtures
  • Official 48-team structure + the FIFA Annex C best-thirds table
  • 678 real penalty shootouts for the shootout calibration

Fig. M·Monte Conceptual schematic

Monte-Carlo simulator — wired end to end

ensemble grid (per fixture) → sample scores (104 matches) → group table (FIFA tiebreaks) → 8 best thirds (Annex C · 0/495) → KO + ET + pens (shootouts cal.) → champion % (± MC-SE)
N = 1,100,000 runs · renormalize every grid before sampling · ΣP(champion) = 1
Source · Oxford Football Forecasting model · structural diagram, not a data plot

Fig. V11 Lower is better · floor = Elo-only · ceiling = de-vigged market

OOS RPS — expanding (headline) and LOTO (optimistic)

The simulator has no match-level RPS — it wraps the frozen ensemble match model and is scored on tournament stage probabilities instead.

The simulator is validated on the tournament outcome space — the ordinal stage reached and the champion — not on match W/D/L. Its match-level skill is the ensemble’s (0.1891 expanding RPS). See the validation page for stage-level scoring and the convergence checks.

Its skill is inherited: the simulator can only be as good as the ensemble it plays out, plus irreducible knockout variance.

Source · Oxford Football Forecasting model · Bookmaker consensus (de-vigged closing odds) · 152 matches · 3 tournaments

The simulator freezes the ensemble match model and plays the real bracket. For each of the 104 fixtures it renormalizes the scoreline grid and samples a scoreline; it builds the twelve group tables with the full FIFA tiebreaker chain (points → goal difference → goals → head-to-head → fair play → drawing of lots); it ranks the third-placed teams, takes the eight best, and slots them into the round of 32 by FIFA’s Annex C table; then it runs the knockouts, with extra time at 30/90 of the regulation goal rate and shootouts calibrated to real base rates, all the way to a champion. Repeating this 1,100,000 times turns the stage tallies into a probability for every team reaching every stage.
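
The group-table step can be sketched as a sort on a tuple of tiebreak keys. This simplified version implements only points, goal difference, goals scored and a random draw of lots; the head-to-head and fair-play steps named in the chain are omitted, and the function name and results are hypothetical.

```python
import random

def rank_group(results, rng):
    """results: list of (home, away, home_goals, away_goals) tuples.

    Returns team names ordered best to worst.
    """
    table = {}
    for home, away, hg, ag in results:
        for team in (home, away):
            table.setdefault(team, {"pts": 0, "gd": 0, "gf": 0})
        table[home]["gf"] += hg
        table[away]["gf"] += ag
        table[home]["gd"] += hg - ag
        table[away]["gd"] += ag - hg
        if hg > ag:
            table[home]["pts"] += 3
        elif hg < ag:
            table[away]["pts"] += 3
        else:
            table[home]["pts"] += 1
            table[away]["pts"] += 1
    # Drawing of lots: a random number used only if all else is level.
    lots = {team: rng.random() for team in table}
    return sorted(
        table,
        key=lambda t: (table[t]["pts"], table[t]["gd"], table[t]["gf"], lots[t]),
        reverse=True,
    )

# Toy group of four teams, six matches, invented scorelines.
toy_results = [
    ("A", "B", 2, 0), ("C", "D", 1, 1), ("A", "C", 1, 0),
    ("B", "D", 2, 2), ("A", "D", 0, 0), ("B", "C", 3, 1),
]
order = rank_group(toy_results, random.Random(0))
```

In the toy group the order is decided on points alone, so the draw of lots never comes into play.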

1.1M

Tournaments simulated

100k locked forecast + 1.0M power re-draws

0 / 495

Best-thirds disagreements

our Annex C parse vs the reference table — unit-tested first

678

Shootouts calibrated

real penalty shootouts set the base rate

104

Fixtures per simulation

the official 48-team schedule, renormalized grids

Fig. M·Sim Conceptual schematic · one run, repeated 1.1M times

The pipeline — grid to champion

Source · Oxford Football Forecasting model · 1.1M tournament simulations · structural diagram, not a data plot

Integrity & the irreducible knockout floor

Every grid is renormalized before sampling (the build asserts the probabilities sum to one to within 1e-9), so stray probability mass never leaks in as fake skill. The conservation checks hold by construction: the champion probabilities sum to one, and the round-of-32 counts sum to 32. Probability stability was re-checked by comparing 50k-run and 100k-run estimates, which agree to within the Monte-Carlo standard error reported beside every figure. The simulator inherits the match model’s ceiling and adds the irreducible variance of knockout football: a one-off shootout caps how accurate any pre-tournament forecast can be.
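
One way a 50k-versus-100k stability check could be expressed is to compare the gap between two estimates with the standard error of their difference. The sketch assumes the two runs are independent and uses a three-standard-error tolerance; both choices, and the example probabilities, are assumptions rather than the site's documented procedure.

```python
import math

def within_mc_error(p_a, n_a, p_b, n_b, z=3.0):
    """True if two Monte-Carlo frequencies agree within z standard errors.

    Treats the two runs as independent binomial samples.
    """
    var_a = p_a * (1.0 - p_a) / n_a
    var_b = p_b * (1.0 - p_b) / n_b
    se_diff = math.sqrt(var_a + var_b)
    return abs(p_a - p_b) <= z * se_diff

# Hypothetical champion probabilities for one team at two run counts.
stable = within_mc_error(0.152, 50_000, 0.150, 100_000)
unstable = within_mc_error(0.170, 50_000, 0.150, 100_000)
```

A two-point gap at these run counts is far outside simulation noise, so a failure of this kind would point to a bug or a changed input rather than bad luck.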

Strengths

  • Real bracket wiring + full FIFA tiebreakers (0/495 thirds errors)
  • MC-SE on every probability
  • Renormalizes every grid before sampling (integrity-checked)

Limits

  • Inherits the match model's skill ceiling
  • Irreducible knockout variance caps achievable accuracy
  • Pre-tournament lock; no refit after kickoff