Finding · the projection residual g

History versus the squad

Two ways to measure a national team disagree more often than you would think. One asks what it has done — decades of results, distilled into a rating. The other asks what it is worth — the transfer-market price of the eleven it puts on the pitch. The gap between them has a name in this project, and a number. It is a description rather than a law, and the essay reads it that way.

Oxford Football Forecasting · Research
Source · Oxford Football Forecasting model · ≈ 6 min read

Most of the time, a team’s record and its squad value tell the same story. Argentina have a towering history and one of the most expensive squads at the tournament; Haiti have neither. Plot every nation’s history against its squad value and the cloud tilts cleanly along a line — richer squads belong to teams with better records, roughly in proportion. That line is the expectation. The interesting teams are the ones that miss it.

The decoupling g is precisely how far a team misses that line. Formally it is the projection residual of a squad’s market value on its history-based strength: take what a team’s track record predicts its squad should be worth, and subtract it from what the squad is actually worth. A team sitting exactly on the line has g = 0. Sit above it — a squad more valuable than your record predicts — and g is positive. Sit below it — a strong record carried by a comparatively inexpensive squad — and g is negative.
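The definition above can be sketched in a few lines. This is a minimal illustration, assuming z-scored inputs and a plain ordinary-least-squares line; it does not reproduce the project's Bayesian hierarchical fit or its credible intervals. The function names and the six data points are hypothetical, not the project's data.

```python
import numpy as np

def zscore(x):
    """Standardise to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def projection_residual(history, value):
    """Residual of z-scored squad value regressed on z-scored history (OLS)."""
    h = zscore(history)
    v = zscore(value)
    slope, intercept = np.polyfit(h, v, deg=1)  # highest power first
    predicted = intercept + slope * h           # what the record predicts
    return v - predicted                        # positive: value above the line

# Hypothetical inputs for six teams (illustration only)
history = [1.8, 1.2, 0.4, -0.3, -1.1, -2.0]
value = [1.5, 1.6, 0.1, 0.5, -1.4, -2.3]
g = projection_residual(history, value)
print(np.round(g, 2))
```

By construction the residuals average to zero and are uncorrelated with history, which is why g can be read as the part of squad value that the record does not account for.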

g is the distance from the line where history and squad value agree. The story is who sits off it — and which way.

The Bayesian hierarchical model does not bolt this on as an afterthought; it emits g as a first-class quantity with its own credible interval, conditioned so that a newcomer with a thin record is not over-read. Across the 48 finalists the residual runs from −0.72 (Australia, the most over-achieving record relative to squad price) to +0.75 (Portugal, the squad most richly valued relative to its history). Everyone else sits between.

Fig. R2.1 History (x) versus squad value (y) · 48 finalists · z-scored

The line where history and squad value agree — and who sits off it

Each team is a point: across is its history-based strength, up is its squad market value. The diagonal is the fit of value on history — the price a team's record predicts. A team's g is its signed vertical distance from that line: above it (blue) the squad is dearer than the record implies; below it (red) the record outruns the price.

above the line · g > 0 · richer than its record
below the line · g < 0 · record outruns the price
stems drawn for labelled teams = the residual itself

Portugal and Qatar sit highest above the line — squads valued well beyond their record. Australia and Morocco sit furthest below — strong histories on comparatively cheap squads. Most teams hug the line; the decoupling is a tail phenomenon.

Source · Oxford Football Forecasting model — the 48 team dossiers. The line is the ordinary-least-squares fit of squad value on history — the exact object g is the residual of.

Take the two poles literally. Portugal top the list at g = +0.75: an elite record, and a squad valued even higher than that record would predict — a side whose market price has run ahead of its results, not behind them. Qatar reach almost the same figure (+0.74) from the opposite direction: a modest history, but a squad the market prices well above what that history alone implies. They are the tournament’s two clearest cases of value exceeding record.

At the other end sit the over-achievers. Australia (−0.72) and Morocco (−0.58) carry strong tournament records on squads the transfer market values comparatively cheaply — exactly the profile of a team that wins more than its price tag says it should. Morocco’s semi-final run in 2022 is the archetype: a history that towers over the squad’s sticker value. Negative g is the number for that.

Fig. R2.2a Decoupling g = +0.75 ± 0.07

Portugal — value above the record

Read right-to-left: a marker on the right means squad value runs ahead of the record; on the left, the record runs ahead of the value.

Portugal sit at the positive pole: the squad is valued above what the history predicts.

Source · Oxford Football Forecasting model. Axis domain = the field's observed g range.

Fig. R2.2b Decoupling g = −0.72 ± 0.05

Australia — record above the value

The ±1 SD band is the model's own uncertainty about where this team sits — wider for teams it has seen less of.

Australia sit at the negative pole: a strong record on a squad the market prices below it.

Source · Oxford Football Forecasting model. Axis domain = the field's observed g range.

Fig. R2.3 Every finalist ranked by g · diverging from the line

The whole field, richest-for-record to cheapest-for-record

The same residual, for all 48, ordered top to bottom. Bars to the right (blue) are squads valued above their record; bars to the left (red) are records that outrun their squad's price. The spine is g = 0 — perfect agreement between history and value.

Portugal and Qatar anchor the top; Australia and Morocco the bottom. Argentina (+0.21), Brazil (+0.39) and Spain (−0.18) sit closer to the middle: their squads are priced nearer to where their records say they should be.

Source · Oxford Football Forecasting model — all 48 finalists, ranked by g. The diverging ramp matches the site's draw-luck palette (blue = value-rich, red = record-rich).

Does the gap actually predict anything?

A residual is only interesting if it carries information about what happens next. So we asked: do teams whose squads outrun their record go further than their record alone would say?

It is one thing to label a team’s g; it is another to claim it matters. The test is direct. Across five tournaments and 118 team-tournaments, regress how far a team actually got — the stage it reached — on its decoupling g, controlling for raw history so we are isolating the residual’s effect, not history’s. If g carries genuine information, its coefficient should be positive: squads that outrun their record should, on average, outrun it on the pitch too.
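The shape of that test can be sketched as an ordinary least-squares fit with two regressors. This is an illustration under stated assumptions: stage reached is treated as a plain number, the history control is a single rating column called `elo`, and all data are synthetic. The helper `ols` is a hypothetical name, and the synthetic coefficients are arbitrary.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients for y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta  # [intercept, coefficient per regressor in the order given]

# Synthetic stand-in for 118 team-tournaments (not the project's data)
rng = np.random.default_rng(0)
n = 118
elo = rng.normal(size=n)   # history control
g = rng.normal(size=n)     # decoupling residual
stage = 2.0 + 1.0 * elo + 0.03 * g + rng.normal(size=n)

intercept, b_g, b_elo = ols(stage, g, elo)
print(f"slope on g, holding history fixed: {b_g:+.3f}")
```

Including `elo` in the design matrix is what "controlling for raw history" means here: the slope on g is estimated among teams with the same history rating.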

The coefficient is positive. It is also small, and its interval is wide enough to swallow zero whole. The point estimate is b = +0.031, with a tournament-clustered standard error of 0.17 and a block-bootstrap confidence interval running from −0.31 to +0.55 (a bootstrap interval need not be symmetric about the estimate). Zero sits comfortably inside it.

Fig. R2.4 stage reached ~ g · 118 team-tournaments · 5 tournaments

The slope on over-performance — positive, but its interval spans zero

b = +0.031: the sign is what the decoupling story predicts, but the confidence interval [−0.31, +0.55] includes 0. Read it as a subgroup shift, not a structural slope.

Source · Oxford Football Forecasting model — OLS of stage-reached on g with an Elo control; tournament-clustered SE and a block-bootstrap CI.
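A block bootstrap of this kind resamples whole tournaments rather than individual teams, so that results from the same tournament stay together. The sketch below is one way to do it, assuming a percentile interval at the 95% level; the source states neither the interval type nor the level, so both are assumptions. Function names and data are hypothetical.

```python
import numpy as np

def slope_on_g(stage, g, elo):
    """OLS slope on g from stage ~ 1 + g + elo."""
    X = np.column_stack([np.ones(len(g)), g, elo])
    beta, *_ = np.linalg.lstsq(X, stage, rcond=None)
    return beta[1]

def block_bootstrap_ci(stage, g, elo, tournament, n_boot=2000, level=0.95, seed=0):
    """Percentile CI for the slope on g, resampling whole tournaments."""
    stage, g, elo, tournament = map(np.asarray, (stage, g, elo, tournament))
    rng = np.random.default_rng(seed)
    # One block of row indices per tournament
    blocks = [np.flatnonzero(tournament == t) for t in np.unique(tournament)]
    draws = np.empty(n_boot)
    for i in range(n_boot):
        # Draw as many tournaments as exist, with replacement
        picked = rng.integers(0, len(blocks), size=len(blocks))
        rows = np.concatenate([blocks[k] for k in picked])
        draws[i] = slope_on_g(stage[rows], g[rows], elo[rows])
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [tail, 1.0 - tail])
    return lo, hi

# Synthetic stand-in: 118 rows spread over 5 tournaments (not the project's data)
rng = np.random.default_rng(1)
n = 118
tournament = np.arange(n) % 5
elo = rng.normal(size=n)
g = rng.normal(size=n)
stage = 2.0 + 1.0 * elo + 0.03 * g + rng.normal(size=n)

lo, hi = block_bootstrap_ci(stage, g, elo, tournament)
print(f"b = {slope_on_g(stage, g, elo):+.3f}, interval [{lo:+.3f}, {hi:+.3f}]")
```

With only five blocks to resample, the bootstrap distribution is coarse, which is one reason an interval built this way can be wide and lopsided.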

The sign is the story’s; the significance is not. Read g as a subgroup shift, never as a slope you can bank.

This is the moment to be careful rather than triumphant. The coefficient points the way the decoupling idea says it should — squads that outrun their record do tend to go a little further — and that directional agreement is worth something. But at five tournaments the evidence cannot separate that tendency from noise, and a single team’s g is not a prediction you should stake anything on. Spain carry a slightly negative residual (−0.18) and remain one of the two title favourites; the residual modulates the story, it does not overturn the ratings that drive it.

The right framing is the one the team pages already use: g is a descriptive read, a way of seeing which squads the market prices above or below their record, not a structural slope you can multiply through a forecast. As a sanity check, the model-based g correlates with the engineered value-minus-history gap at 0.23: positive, which is the direction you would expect if both capture the same gap, though the agreement is modest. It just cannot, on this much data, promise you what that gap will do next June.

Why g is identified at all (and why we don't quote a structural split)

History and squad value are strongly correlated — good teams have good records and expensive squads — so a naive attempt to separate their effects is weakly identified. The Bayesian model sidesteps a false-precision split: rather than claim “X% of strength is history and Y% is squad,” it reports g as a well-conditioned projection residual with a credible interval, and reports the posterior correlation between the two channels rather than a structural percentage. That is the conservative object — and it is why the tournament-level slope is reported with its interval rather than promoted to a headline.
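The cost of that correlation can be made concrete with the standard variance inflation factor. In a regression with two regressors correlated at r, the sampling variance of each slope is 1 / (1 − r²) times what it would be if the regressors were uncorrelated. The sketch below simply tabulates that factor; the r values are illustrative, since the source does not quote the correlation between history and squad value.

```python
def variance_inflation(r):
    """Variance inflation of an OLS slope when two regressors correlate at r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r ** 2)

# Illustrative correlations (assumed, not taken from the model)
for r in (0.0, 0.5, 0.8, 0.9, 0.95):
    print(f"r = {r:.2f}: slope variance inflated by {variance_inflation(r):.1f}x")
```

As r approaches 1 the factor grows without bound, which is the sense in which a split into "history" and "squad" shares is weakly identified.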