Baseline · the calibrated goal kernel

Dixon-Coles — goals as Poisson, with the draws fixed

Each team’s goals are a Poisson count from its attack and the opponent’s defence, with a low-score correction that repairs the classic under-prediction of 0-0 and 1-1 draws. The validated kernel and the simulator’s default engine.

0.1926

OOS RPS · expanding

the headline skill (realism protocol)

0.1828

OOS RPS · LOTO

optimistic ceiling (leaks future folds)

Leaderboard rank

of 7 · CI 0.1528–0.2292

§ 01

The intuition

In plain English, before any mathematics.

The model treats each team's goals as a Poisson count driven by attack and defence strengths. A low-score correction (the τ term) fixes the classic under-prediction of 0-0 and 1-1 draws, and an exponential time-decay makes recent matches count more. It is the validated baseline kernel and the cheapest simulator input.

§ 02

Mathematical specification

From two team strengths to a full scoreline distribution: the goal rates, the corrected joint probability, and the time-decayed likelihood the parameters are fitted by.

$$λ_{i} = \exp(μ + h_{adv} \cdot 1_{host} + α_{i} - β_{j}), \qquad P(X = x, Y = y) = τ_{λ_{i}, λ_{j}}(x, y) \, \frac{e^{-λ_{i}} λ_{i}^{x}}{x!} \, \frac{e^{-λ_{j}} λ_{j}^{y}}{y!}$$

The goal rate, and the corrected bivariate scoreline probability

01 The goal rates

Each side’s expected goals combine the global scoring level μ, its own attack, and the opponent’s defence. The host bonus fires only for the United States, Canada and Mexico in their own stadiums; every other fixture is treated as neutral. Attack and defence are centred to mean zero so the parameters are identified.

$$\log λ_{i} = μ + h_{adv} \cdot 1_{host} + α_{i} - β_{j}, \qquad \log λ_{j} = μ + α_{j} - β_{i}$$

Two log-linear Poisson goal rates per fixture
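As a rough illustration of the two log-linear rates, the sketch below computes them for one fixture. The function names (`centre`, `goal_rates`), the flag `i_at_home`, the team choices and every numeric value are illustrative assumptions, not the fitted parameters. It also assumes the host nation, when present, is listed as team i, since the host term appears only in the rate for team i.

```python
import math

# Host nations named in the text; the bonus applies only in their own stadiums.
HOSTS = {"United States", "Canada", "Mexico"}

def centre(strengths):
    """Shift a {team: value} dict so its values average to zero."""
    mean = sum(strengths.values()) / len(strengths)
    return {team: value - mean for team, value in strengths.items()}

def goal_rates(team_i, team_j, mu, h_adv, attack, defence, i_at_home=False):
    """Return (lambda_i, lambda_j) from the two log-linear rates."""
    # The host indicator is 1 only for a host nation playing in its own stadium.
    host = 1.0 if (i_at_home and team_i in HOSTS) else 0.0
    log_lam_i = mu + h_adv * host + attack[team_i] - defence[team_j]
    log_lam_j = mu + attack[team_j] - defence[team_i]
    return math.exp(log_lam_i), math.exp(log_lam_j)

# Made-up strengths for illustration only (not fitted values).
attack = centre({"Mexico": 0.30, "Japan": 0.20, "Ghana": -0.10})
defence = centre({"Mexico": 0.10, "Japan": 0.15, "Ghana": -0.05})

lam_mex, lam_jpn = goal_rates("Mexico", "Japan", mu=0.2, h_adv=0.25,
                              attack=attack, defence=defence, i_at_home=True)
print(f"expected goals: Mexico {lam_mex:.2f}, Japan {lam_jpn:.2f}")
```

Centring matters because adding a constant to every attack value and subtracting it from μ would leave all rates unchanged; fixing the mean at zero removes that ambiguity.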

02 The corrected scoreline probability

Independent Poissons get low-scoring draws wrong. The τ factor reweights exactly four cells of the scoreline grid — 0-0, 1-0, 0-1 and 1-1 — and leaves everything else untouched.

$$P(X = x, Y = y) = τ_{λ_{i}, λ_{j}}(x, y) \, \frac{e^{-λ_{i}} λ_{i}^{x}}{x!} \, \frac{e^{-λ_{j}} λ_{j}^{y}}{y!}$$

The joint scoreline probability with the low-score correction

03 The τ correction

With ρ negative, the two draw cells are lifted and the two narrow wins are trimmed — repairing the classic under-prediction of 0-0 and 1-1. ρ is a single shared parameter, clipped to ±0.2.

$$τ_{λ_{i}, λ_{j}}(x, y) = \begin{cases} 1 - λ_{i} λ_{j} ρ & (x, y) = (0, 0) \\ 1 + λ_{i} ρ & (x, y) = (0, 1) \\ 1 + λ_{j} ρ & (x, y) = (1, 0) \\ 1 - ρ & (x, y) = (1, 1) \\ 1 & \text{otherwise} \end{cases}$$

The Dixon–Coles τ on the four low-score cells
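The four-cell correction is small enough to write out directly. The sketch below is one possible reading of it, with the two goal rates passed as `lam_i` and `lam_j`; the helper names and the example values (ρ = -0.1, rates 1.5 and 1.0) are assumptions chosen only to show the direction of each adjustment.

```python
def clip_rho(rho, bound=0.2):
    """Keep the dependence parameter inside [-bound, +bound]."""
    return max(-bound, min(bound, rho))

def tau(x, y, lam_i, lam_j, rho):
    """Low-score correction factor for the scoreline (x, y)."""
    if (x, y) == (0, 0):
        return 1.0 - lam_i * lam_j * rho
    if (x, y) == (0, 1):
        return 1.0 + lam_i * rho
    if (x, y) == (1, 0):
        return 1.0 + lam_j * rho
    if (x, y) == (1, 1):
        return 1.0 - rho
    return 1.0  # every other cell is left untouched

# Illustrative values only: a negative rho lifts 0-0 and 1-1, trims 1-0 and 0-1.
rho = clip_rho(-0.1)
for cell in [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]:
    print(cell, round(tau(*cell, lam_i=1.5, lam_j=1.0, rho=rho), 3))
```

With independent Poisson cells the four adjustments cancel exactly, so the correction moves probability between the low-score cells without changing the total.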

04 The fitted objective

Parameters maximise a time-decayed log-likelihood: each match is weighted by e^{-ξ t}, where t is its age in days and ξ = 0.0019 per day. That is a half-life of about one year (ln 2 / 0.0019 ≈ 365 days), slower than the club-football convention because international sides play sparsely. A small ridge keeps the strengths of rarely-seen teams finite.

$$\hat{θ} = \arg\max_{θ} \sum_{m} e^{-ξ t_{m}} \log P_{θ}(x_{m}, y_{m}) - ℓ_{2} \sum_{i} (α_{i}^{2} + β_{i}^{2})$$

Time-decay-weighted maximum likelihood with a ridge
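To make the objective concrete, the following sketch evaluates the penalised, time-weighted log-likelihood for a list of matches. It is not the fitting code: the match dictionary layout, the function names and the default `ridge=0.01` are assumptions (the text only says the ridge is small), and a real fit would hand this function to a numerical optimiser.

```python
import math

XI = 0.0019  # per-day decay quoted in the text

def log_poisson(k, lam):
    """Log of the Poisson probability of k goals at rate lam."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

def tau(x, y, lam_i, lam_j, rho):
    """Low-score correction on the 0-0, 0-1, 1-0 and 1-1 cells."""
    table = {(0, 0): 1.0 - lam_i * lam_j * rho, (0, 1): 1.0 + lam_i * rho,
             (1, 0): 1.0 + lam_j * rho, (1, 1): 1.0 - rho}
    return table.get((x, y), 1.0)

def objective(matches, mu, h_adv, attack, defence, rho, xi=XI, ridge=0.01):
    """Time-weighted log-likelihood minus a ridge on the team strengths."""
    total = 0.0
    for m in matches:
        lam_i = math.exp(mu + h_adv * m["host"] + attack[m["i"]] - defence[m["j"]])
        lam_j = math.exp(mu + attack[m["j"]] - defence[m["i"]])
        log_p = (math.log(tau(m["x"], m["y"], lam_i, lam_j, rho))
                 + log_poisson(m["x"], lam_i) + log_poisson(m["y"], lam_j))
        total += math.exp(-xi * m["days"]) * log_p  # older matches count less
    penalty = ridge * sum(attack[t] ** 2 + defence[t] ** 2 for t in attack)
    return total - penalty

# Half-life implied by the decay rate: roughly one year.
half_life_days = math.log(2) / XI
print(round(half_life_days, 1))

# Tiny made-up example: one 1-1 draw played on the cut date, all strengths zero.
teams = {"A": 0.0, "B": 0.0}
demo = [{"i": "A", "j": "B", "x": 1, "y": 1, "host": 0, "days": 0}]
print(objective(demo, mu=0.0, h_adv=0.0, attack=teams, defence=teams, rho=0.0))
```

A match played one half-life before the cut contributes half the weight of a match played on the cut date.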

05 From grid to forecast

The grid runs to 10 goals a side and is renormalised; win, draw and loss probabilities — and totals, handicaps and exact scores — are all read off the same object.

$$P(H) = \sum_{x > y} P(x, y), \qquad P(D) = \sum_{x = y} P(x, y), \qquad P(A) = \sum_{x < y} P(x, y)$$

Every betting market is a deterministic sum over the grid
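The grid-to-forecast step can be sketched in a few lines: build the corrected grid up to 10 goals a side, renormalise it, then sum the relevant cells. The function names, the example rates (1.6 and 1.1), ρ = -0.1 and the 2.5-goal totals line are illustrative assumptions.

```python
import math

MAX_GOALS = 10  # the grid runs to 10 goals a side, as stated in the text

def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals at rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def tau(x, y, lam_i, lam_j, rho):
    """Low-score correction on the 0-0, 0-1, 1-0 and 1-1 cells."""
    table = {(0, 0): 1.0 - lam_i * lam_j * rho, (0, 1): 1.0 + lam_i * rho,
             (1, 0): 1.0 + lam_j * rho, (1, 1): 1.0 - rho}
    return table.get((x, y), 1.0)

def score_grid(lam_i, lam_j, rho, max_goals=MAX_GOALS):
    """Corrected scoreline grid, renormalised after truncation."""
    grid = [[tau(x, y, lam_i, lam_j, rho) * poisson_pmf(x, lam_i) * poisson_pmf(y, lam_j)
             for y in range(max_goals + 1)]
            for x in range(max_goals + 1)]
    total = sum(map(sum, grid))
    return [[p / total for p in row] for row in grid]

def outcome_probs(grid):
    """Return (P(H), P(D), P(A)) by summing below, on and above the diagonal."""
    cells = [(x, y, p) for x, row in enumerate(grid) for y, p in enumerate(row)]
    home = sum(p for x, y, p in cells if x > y)
    draw = sum(p for x, y, p in cells if x == y)
    away = sum(p for x, y, p in cells if x < y)
    return home, draw, away

def prob_over(grid, line=2.5):
    """Probability that total goals exceed the given line."""
    return sum(p for x, row in enumerate(grid) for y, p in enumerate(row) if x + y > line)

# Illustrative rates only.
grid = score_grid(lam_i=1.6, lam_j=1.1, rho=-0.1)
print([round(p, 3) for p in outcome_probs(grid)], round(prob_over(grid), 3))
```

Because every quantity is a sum over the same grid, the win, draw, loss, totals and exact-score numbers are guaranteed to be mutually consistent.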

Symbol key

$μ$: the global mean log goal rate
$α_{i}$: team i’s attack strength (centred to mean zero)
$β_{j}$: team j’s defence strength (centred to mean zero)
$h_{adv} \cdot 1_{host}$: host advantage, gated to the three host nations only
$λ_{i}, λ_{j}$: the resulting Poisson goal rates of the two teams
$τ_{λ_{i}, λ_{j}}$: the Dixon–Coles low-score correction on the 0-0 / 1-0 / 0-1 / 1-1 cells
$ρ$: the dependence parameter of the correction, clipped to ±0.2
$ξ$: time-decay per day (0.0019 — a one-year half-life), weighting recent matches more
$t_{m}$: how many days before the fit cut match m was played
$ℓ_{2}$: a small ridge penalty keeping rarely-seen teams’ strengths finite

§ 03

What data it uses

The inputs this model reads — and only these.

Goal counts from 49,445 international results
Exponential time-decay: recent matches weigh more (one-year half-life)
Host gating (USA/CAN/MEX in their own stadiums only)

§ 04

How it works

A schematic of the model wired end to end.

Fig. M·Dixon Conceptual schematic

Dixon-Coles — wired end to end

Source · Oxford Football Forecasting model · structural diagram, not a data plot

§ 05

Out-of-sample skill

Where this model lands between the Elo floor and the market ceiling, on both backtest protocols.
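The difference between the two protocols can be illustrated with fold construction alone. This sketch assumes LOTO stands for leave-one-tournament-out and that folds are whole tournaments in date order; the labels `T1` to `T3` are placeholders, not the actual tournaments in the backtest.

```python
def expanding_folds(tournaments):
    """Train only on tournaments that come before the test tournament."""
    return [(tournaments[:k], tournaments[k]) for k in range(len(tournaments))]

def loto_folds(tournaments):
    """Train on every other tournament, including ones played later."""
    return [(tournaments[:k] + tournaments[k + 1:], tournaments[k])
            for k in range(len(tournaments))]

ordered = ["T1", "T2", "T3"]  # placeholder labels in chronological order

for train, test in expanding_folds(ordered):
    print("expanding: train", train, "-> test", test)
for train, test in loto_folds(ordered):
    print("LOTO:      train", train, "-> test", test)
```

Under this reading, the LOTO fold for `T2` trains on `T3`, which was played afterwards. That is the sense in which the LOTO figure leaks future information and serves as an optimistic ceiling.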

Fig. V11 Lower is better · floor = Elo-only · ceiling = de-vigged market

OOS RPS — expanding (headline) and LOTO (optimistic)

On the headline expanding window this model scores 0.1926: 0.0012 better than the Elo floor (0.1938) and 0.0021 worse than the market ceiling (0.1905).

Expanding 0.1926

LOTO 0.1828

Bar fills to the model’s RPS on the floor–ceiling axis; the whisker on the expanding bar is the conservative 95% CI (0.1528–0.2292). Lower (left) is better.

It clears the Elo floor; the gap to the market is small and, with only n = 3 tournaments, inside the bootstrap interval.

Source · Oxford Football Forecasting model · Bookmaker consensus (de-vigged closing odds) · 152 matches · 3 tournaments
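For readers who want to reproduce the metric, here is a minimal sketch of a per-match score, assuming RPS here means the standard ranked probability score over the ordered outcomes home win, draw, away win. The function name and the example forecast are illustrative; the gaps at the end use only the three figures quoted above.

```python
def rps(probs, outcome_index):
    """Ranked probability score for one match; 0 is perfect, lower is better.

    probs: forecast probabilities in the order (home, draw, away).
    outcome_index: 0 for a home win, 1 for a draw, 2 for an away win.
    """
    n = len(probs)
    cum_forecast = 0.0
    cum_observed = 0.0
    score = 0.0
    for k in range(n - 1):
        cum_forecast += probs[k]
        cum_observed += 1.0 if k == outcome_index else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score / (n - 1)

# Illustrative forecast only: a home favourite, scored against each result.
forecast = (0.50, 0.27, 0.23)
print([round(rps(forecast, k), 4) for k in range(3)])

# Gaps quoted in the text, recomputed from the three headline figures.
model, elo_floor, market = 0.1926, 0.1938, 0.1905
print(round(model - elo_floor, 4), round(model - market, 4))
```

The score compares cumulative probabilities, so a forecast that puts weight on a draw is penalised less when the result is a narrow neighbour (a win) than one that put the weight on the opposite result.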

§ 06

Strengths & limits

What this model is good for — and where it is weak.

Strengths

Calibrated draw probabilities (tau correction)
Emits a full scoreline grid, not just W/D/L
Fast — the simulator's default kernel

Limits

Cold-starts unseen teams to a grand mean
Linear attack/defence; no rich squad features
Single global dependence parameter