WC 2026 · Forecasting Oxford Football Forecasting

Method · global squad coverage

Seeing the whole field

Public football analytics runs on club data from the rich leagues of Western Europe — a lens that covers most of a French or Spanish squad and almost none of a Qatari or South African one. This project extends the squad layer to a genuinely global club panel, and the gain lands precisely where top-five-only work is blind. A placebo test shows it is the coverage itself doing the work.

Oxford Football Forecasting · Research · Source: Oxford Football Forecasting model

Every forecasting model is only as good as what it can see — and what football analytics can usually see is strikingly lopsided. The club data behind most squad-quality work, this project’s early squad layer included, covers the top five European leagues: England, Spain, Germany, Italy, France. For a France or a Spain, that is almost the entire 26-man squad. For an Uzbekistan, a Qatar, a South Africa, it is one player, or none. On that footing, 73% of the average European squad is visible against 17% of the average Asian one.

On its face that looks like a coverage gap — a nuisance, not a bias. It is in fact a bias, and a subtle one. Presence in a top-five league is itself a proxy for wealth and football development: rich, well-resourced national teams stock the Premier League and LaLiga; poorer ones do not. So a model trained on top-five coverage does not merely miss data for non-European squads — it learns a correlation that runs the wrong way. Low coverage becomes a feature, and the feature reads weak team. The blind spot is shaped exactly like a continent.

“Low coverage” is not missing data. It is a feature — and a top-five-only model reads it as “weak team.”

This project’s contribution is not a cleverer algorithm but a wider lens. The squad layer was extended from the top-five panel to a genuinely global one — API-Football, spanning 68 countries and 107 leagues — and the squad-form and match-fitness signals were recomputed from it. The rest of this essay shows, with the project’s own numbers, three things: that the coverage gap was real and continent-shaped; that closing it moved the model; and that the movement came from seeing more players, not from quietly re-weighting which leagues count.

Fig. R1.1 World choropleth · share of each squad with matched club data

The blind spot, drawn on a map

The map has two states: BEFORE (Understat, top-5 Europe) and AFTER (global API-Football). Before, almost everything outside Western Europe is pale, not because those squads are weak but because their players play in leagues the feed never saw. After, the map fills in.

England and Scotland share one shape (both are GBR in ISO-3166); it is tinted with England’s coverage. The bars and table below treat every nation individually.

Coverage rises from 48% to 85% of the average squad, and the lift is largest exactly where the top-five feed was blind: OFC 15%→92%, AFC 17%→71%. UEFA, already the best-covered confederation, moves least: 73%→92%.

Source · Understat (top-5 Europe) · API-Football (global club coverage)

Put numbers on it. Across all 48 squads, the share of players the feed could actually see rose from 47.8% to 85.0%, a jump of 37 points. But the average hides the whole point, which is where the jump landed. Break it down by confederation and the correction is unmistakably targeted: the single Oceanian side goes from 15% to 92% coverage; Asia’s nine teams from 17% to 71%. Europe’s sixteen, the best covered to begin with, gain least: 73% to 92%.
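A minimal sketch of the coverage metric used in these figures, assuming a squad is a list of player identifiers and a feed is the set of identifiers it holds club data for. The function names and toy identifiers are hypothetical illustrations, not the project's code.

```python
from statistics import mean

def squad_coverage(squad, matched_ids):
    """Share of a squad's players that have matched club data (0.0 to 1.0)."""
    if not squad:
        raise ValueError("squad must contain at least one player")
    seen = sum(1 for player in squad if player in matched_ids)
    return seen / len(squad)

def mean_coverage(squads, matched_ids):
    """Unweighted mean of per-squad coverage: the 'average squad' figure."""
    return mean(squad_coverage(s, matched_ids) for s in squads.values())

# Toy data (hypothetical): a 26-man squad with one player in the narrow feed.
squad = [f"player_{i}" for i in range(26)]
narrow_feed = {"player_0"}
wide_feed = set(squad)

print(f"{squad_coverage(squad, narrow_feed):.0%}")  # 4%  (1 of 26)
print(f"{squad_coverage(squad, wide_feed):.0%}")    # 100%
```

Averaging per squad rather than per player keeps a 26-man squad with one visible player from being diluted by well-covered squads, which is the sense in which the essay speaks of "the average squad".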

Fig. R1.2 Coverage before → after, weakest-covered confederation first

Who the global feed actually helped

Confederation · teams · before (top-5 Europe) → after (global panel) · change
OFC · 1 team · 15% → 92% · +77pt
AFC · 9 teams · 17% → 71% · +54pt
CONCACAF · 6 teams · 30% → 81% · +50pt
CAF · 10 teams · 45% → 88% · +44pt
CONMEBOL · 6 teams · 54% → 85% · +30pt
UEFA · 16 teams · 73% → 92% · +19pt
All 48 · 48% → 85%

This is not a uniform shift; it is the correction of a continent-shaped blind spot. The confederations the top-five feed under-saw (OFC, AFC, CONCACAF) gain the most; UEFA, the best covered beforehand, gains least. Every confederation ends up between 71% and 92%, against a spread of 15% to 73% before.

Source · Understat (top-5 Europe) · API-Football (global club coverage)
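As a consistency check, the confederation means in Fig. R1.2 can be weighted by team count to see whether they reproduce the all-squad figure. The sketch below uses only the rounded percentages printed in the figure, so a small discrepancy against the unrounded headline (47.8% and 85.0%) is to be expected.

```python
# (confederation, teams, before %, after %) as printed in Fig. R1.2
ROWS = [
    ("OFC", 1, 15, 92),
    ("AFC", 9, 17, 71),
    ("CONCACAF", 6, 30, 81),
    ("CAF", 10, 45, 88),
    ("CONMEBOL", 6, 54, 85),
    ("UEFA", 16, 73, 92),
]

def team_weighted_mean(rows, column):
    """Average a per-confederation percentage, weighting by number of teams."""
    total_teams = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total_teams

before = team_weighted_mean(ROWS, 2)
after = team_weighted_mean(ROWS, 3)

print(sum(row[1] for row in ROWS))         # 48
print(round(before, 1), round(after, 1))   # 47.7 85.0
```

The team counts sum to 48, and the weighted means land on the headline to within rounding.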

The individual cases are starker than the confederation means. Qatar carried a single top-five player into the tournament — coverage 4% — and the global panel sees 100% of the squad. South Africa and Saudi Arabia make almost the same leap. These were not weak squads the old model rightly discounted; they were strong squads the old model could not see. Two European teams — Netherlands and Germany — actually dip a little, because the global match window is stricter than Understat’s full season. Coverage moves in both directions under the same rule — a uniform window, not a thumb on the scale.

Fig. R1.3 Out-of-sample RPS · lower is better · 152 backtest matches

What the global panel did to the model: worst to best-of-its-kind

The gradient-boosted goal model, scored two ways: with the top-5 squad signal, and with the global one. The Elo floor and the de-vigged market ceiling are drawn as reference lines. Switching the feed lifts the GBM from level with the Elo floor (0.1937 against 0.1938) to 0.1921, roughly halfway to the market ceiling at 0.1905.

LightGBM-Poisson · BEFORE (top-5 squad signal) 0.1937 · AFTER (global squad signal) 0.1921
Reference lines · Elo floor 0.1938 · Market ceiling 0.1905 · Ensemble blend 0.1891

Among the 5 standalone learners (excluding the ensemble blend and the market benchmark), the global GBM ranks #1; the top-5 GBM ranked #4. That is the “worst to best-non-market” shorthand the rest of the site uses, in full.

The top-5 GBM (0.1937) ranked fourth of the five standalone learners, all but tied with the bare Elo floor (0.1938). The global GBM (0.1921) becomes the single strongest learner below the ensemble and the market, a −0.0017 RPS gain from the feed alone.

Source · Oxford Football Forecasting model — out-of-sample backtest, expanding protocol. Floor = Elo-only; ceiling = de-vigged market.
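For readers unfamiliar with the metric, here is a minimal sketch of the ranked probability score for a single match, assuming the usual three ordered outcomes (home win, draw, away win) and the common normalisation by the number of outcomes minus one. The source does not state which normalisation or de-vigging method the project uses; the proportional de-vig below is one common choice, included as an assumption, and the odds are invented.

```python
def rps(probs, outcome):
    """Ranked probability score for one match (lower is better).

    probs   -- forecast probabilities over ordered outcomes (home, draw, away)
    outcome -- index of the observed outcome (0, 1 or 2)
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    cum_forecast = 0.0
    cum_observed = 0.0
    total = 0.0
    # The last cumulative term is always 1 - 1 = 0, so it is skipped.
    for i, p in enumerate(probs[:-1]):
        cum_forecast += p
        cum_observed += 1.0 if i == outcome else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (len(probs) - 1)

def devig_proportional(decimal_odds):
    """Turn bookmaker decimal odds into probabilities by rescaling the
    inverse odds so they sum to 1 (assumed method, not the project's)."""
    inverse = [1.0 / odds for odds in decimal_odds]
    overround = sum(inverse)
    return [q / overround for q in inverse]

print(round(rps([0.5, 0.3, 0.2], 0), 3))  # 0.145 (home win observed)
print(round(rps([0.5, 0.3, 0.2], 2), 3))  # 0.445 (away win observed)

market = devig_proportional([2.0, 3.5, 4.0])  # hypothetical odds
print(round(sum(market), 6))                  # 1.0
```

Because the score works on cumulative probabilities, a forecast that misses by one ordered category (home win predicted, draw observed) is penalised less than one that misses by two.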

Switching feeds did two things at once. It added players the model had never seen — the coverage effect we have been describing. But it also changed the mix of leagues the squad signal was built from, and leagues are weighted by strength. It is entirely possible that the model improved not because it saw more of the world, but because the new league weighting happened to flatter the backtest. If that were true, the “de-biasing” story would be a coincidence dressed up as a correction.

So we ran the placebo. Hold the coverage fixed (show the model no extra players) and only re-weight the leagues the way the global panel implies. If league re-weighting were doing the work, this alone should reproduce the lift. It did not. The gain showed up only when coverage actually increased, and it concentrated where coverage increased most: off-UEFA, on the low-coverage squads. On the backtest’s low-coverage stratum the global model scores 0.0048 RPS below the Elo floor (lower is better), which is directionally what a genuine coverage fix should produce and what a league-weighting artefact should not.

The placebo is the whole point. Re-weighting leagues without adding players did not reproduce the gain — so it was coverage, not the weighting.

How the placebo was constructed (and how unseen players are handled)

The coverage A/B is cleanest inside the gradient-boosted model, where the squad-form features enter as match-level differences. The placebo arm re-derives the league-strength coefficients from the global panel but feeds the model the same set of observed players as the top-5 arm — isolating the weighting change from the coverage change. The lift did not survive that isolation.
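The three arms can be sketched as follows. This is an illustration under stated assumptions, not the project's pipeline: it assumes squad form is the mean of strength-adjusted club form over the players an arm is allowed to see, and every name, league label and number below is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlayerRow:
    player: str
    league: str
    club_form: float  # hypothetical per-player club-form score

def visible(rows, leagues):
    """Players whose league is covered by the arm's panel."""
    return [r for r in rows if r.league in leagues]

def squad_form(rows, league_strength, leagues):
    """Mean strength-adjusted club form over visible players (None if none)."""
    seen = visible(rows, leagues)
    if not seen:
        return None
    return sum(league_strength[r.league] * r.club_form for r in seen) / len(seen)

# Hypothetical panels and league-strength coefficients.
TOP5_PANEL = {"LEAGUE_A"}
GLOBAL_PANEL = {"LEAGUE_A", "LEAGUE_B"}
OLD_STRENGTH = {"LEAGUE_A": 1.00}
NEW_STRENGTH = {"LEAGUE_A": 0.95, "LEAGUE_B": 0.50}

squad = [
    PlayerRow("p1", "LEAGUE_A", 0.6),
    PlayerRow("p2", "LEAGUE_B", 0.9),
    PlayerRow("p3", "LEAGUE_B", 0.7),
]

arms = {
    # Baseline: old coefficients, old set of observed players.
    "top5": squad_form(squad, OLD_STRENGTH, TOP5_PANEL),
    # Placebo: new coefficients, but the SAME observed players as the baseline.
    "placebo": squad_form(squad, NEW_STRENGTH, TOP5_PANEL),
    # Treatment: new coefficients and the wider set of observed players.
    "global": squad_form(squad, NEW_STRENGTH, GLOBAL_PANEL),
}
for name, value in arms.items():
    print(name, round(value, 3))
```

The design point is that the placebo arm differs from the baseline only in the coefficients and from the treatment only in who is visible, so a lift that appears in the treatment arm alone is attributable to coverage.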

One implementation detail matters: club form remains genuinely unobserved for roughly 56% of player-rows even after the global feed. The model treats those as truly missing — when a squad’s club form is unseen, it falls back on history (Elo) and market value rather than imputing a number. So the global panel adds signal where it exists and stays silent where it does not; it never invents a player.
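A minimal sketch of that missing-value rule, assuming the squad-form feature enters as a home-minus-away difference and that unseen form is carried as NaN rather than filled in. The field names and numbers are hypothetical; gradient-boosting libraries such as LightGBM can treat NaN as missing by default, which is what makes this workable without imputation.

```python
import math

def form_difference(home_form, away_form):
    """Match-level squad-form difference; NaN if either side is unobserved."""
    if home_form is None or away_form is None:
        return math.nan
    return home_form - away_form

def feature_row(home, away):
    """Build one match's features. Elo and market value are always present;
    squad form stays NaN (never imputed) when it was not observed."""
    return {
        "elo_diff": home["elo"] - away["elo"],
        "market_value_diff": home["market_value"] - away["market_value"],
        "squad_form_diff": form_difference(
            home.get("squad_form"), away.get("squad_form")
        ),
    }

# Hypothetical teams: the away side has no observed club form.
home = {"elo": 1800, "market_value": 9.1, "squad_form": 0.62}
away = {"elo": 1650, "market_value": 7.4, "squad_form": None}

row = feature_row(home, away)
print(row["elo_diff"])                     # 150
print(math.isnan(row["squad_form_diff"]))  # True
```

Filling the gap with zero or a league average would put back exactly the kind of manufactured signal the essay argues against; leaving it missing lets the Elo and market-value features carry the prediction.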

Two true sentences, side by side. The first: extending the squad layer to a global panel moved the gradient-boosted model from fourth of the five standalone learners to the best single model below the ensemble blend and the market, and it did so by seeing more of the world rather than by re-weighting it: a placebo-controlled, directionally predicted result. The second: with only three out-of-sample tournaments of evidence, the bootstrap interval on that gain comfortably includes zero. The effect points the right way but is not statistically significant, and both facts belong in the same paragraph.
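To make "the bootstrap interval includes zero" concrete, here is a sketch of a percentile bootstrap on paired per-match RPS differences (global minus top-5, so negative favours the global feed). The source does not say whether the project resamples matches or whole tournaments; resampling matches is an assumption here, and the input data is synthetic, not the project's results.

```python
import random

def bootstrap_mean_ci(diffs, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of paired differences."""
    if not diffs:
        raise ValueError("need at least one paired difference")
    rng = random.Random(seed)
    n = len(diffs)
    # Resample matches with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(diffs, k=n)) / n for _ in range(n_resamples)
    )
    lo_idx = int(n_resamples * alpha / 2)
    hi_idx = n_resamples - 1 - lo_idx
    return means[lo_idx], means[hi_idx]

# Synthetic stand-in for per-match differences (NOT real backtest output):
# a small negative mean buried in much larger match-to-match noise.
noise = random.Random(42)
toy_diffs = [noise.gauss(-0.002, 0.05) for _ in range(152)]

lo, hi = bootstrap_mean_ci(toy_diffs)
print(f"95% interval for mean RPS difference: [{lo:.4f}, {hi:.4f}]")
print("includes zero:", lo <= 0.0 <= hi)
```

With only three tournaments, resampling at the tournament level would be the more conservative choice and would typically widen the interval further.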

This is why the project’s headline is the modest one it is. The fully-pooled ensemble matches the de-vigged market on out-of-sample RPS — 0.1891 against 0.1905 — and does not significantly beat it. The global panel is not the thing that supports a market-beating claim; nothing here does. What it buys is something quieter and more defensible: a squad signal that is no longer blind to four-fifths of the planet, and a model that earns its keep precisely on the teams the market’s thin information serves worst.