§ Data · the transparency layer
Every probability on this site is drawn from one locked, fully traceable dataset. This page opens the whole store: the 20 raw sources behind the forecast, the coverage map that changed the result, a dictionary for all 35 engineered features, the league-strength normalisation, the pipeline lineage — and a live SQL console that runs in your browser over all 12 tables.
20 raw data sources · 807.9 MB across 583 files
61 pipeline scripts · raw → processed → models → the locked forecast
35 engineered features · per team, grouped into five families
12 queryable tables · 74,722+ cells, live in-browser SQL
Source · project data inventory — sizes and file counts measured from the raw data store; features and table cells counted from the published dataset.
§ 01
Twenty raw inputs, from a 49,445-match results spine to a 52k-quote odds archive. Each card shows what it is, who provides it, its span, its size and the role it plays in the forecast. Two sources are shown as attempted-but-deferred.
martj42/international_results (GitHub)
Every men's A-international — results, goalscorers and penalty shootouts.
eloratings.net
Current world Elo for 244 teams plus year-end ratings back to 1950 — the published anchor we also recompute reproducibly from results.
fixturedownload.com + Wikipedia
The official 104-fixture / 12-group / 48-team structure with venues and dates, and the 26-man squads (number, position, dob, caps, goals, club) for all 1,245 selected players.
Transfermarkt
National-team squad market values for all 48 teams (2,278 players × value, age, caps, goals) — England €1.88bn down to Jordan €16m — plus the 2016→2024 value history.
Wikipedia / federation pages
Each nation's coach, their nationality, the foreign-coach flag and appointment date — feeds the coaches page and every dossier header.
SoFIFA / EA FC (via soccerdata)
Club EA-FC ratings (96 clubs) as a per-player quality reference. Player-level ratings proved redundant — fully superseded by Transfermarkt value.
Understat (via soccerdata)
Club-level expected goals (xg, np_xg, xg_chain, xg_buildup) for the top-5 European leagues, extended back to 2016-17 so each backtest tournament has its club season.
API-Football
Global club appearances, minutes and goals across 107 leagues in 68 countries — the genuinely worldwide club panel that de-biased the squad layer away from a Europe-only feed.
statsbomb/open-data (GitHub)
Full event streams and lineups for recent major tournaments (WC2018/2022, Euro2020/2024, Copa2024) — the public alternative to licensed Opta/Wyscout event data.
FBref (via soccerdata)
Tournament schedules and player/team season stats (goals, assists, shots). Season tables carry no xG — that is derived from StatsBomb instead.
the-odds-api (Pro)
WC2026 outright and full match markets (h2h, totals, spreads), plus the out-of-time closing-odds consensus for the 152 backtest matches — the de-vigged market benchmark.
Wikipedia (past tournaments)
As-of squads for five past tournaments (WC2018/2022, Euro2020/2024, Copa2024) so every engineered feature can be rebuilt leakage-safe for the out-of-time backtest.
World Bank API
GDP, GDP-per-capita and population for 261 countries — the macro prior and the GDP-growth context features.
curated venues + open-elevation + open-meteo
The 16 host venues with coordinates, elevation, June–July heat index and a full travel-distance matrix — hottest Houston 47°C, highest Mexico City 2,287m.
US Census (ACS B05006 / B16001)
Foreign-born population by origin and home-language speakers in the United States — the novel quasi-home-support and host-language-familiarity signals.
UN DESA migration
Bilateral international migrant-stock — the broader diaspora mechanism behind quasi-home support, complementing the US-specific census layer.
Eurostat COFOG
Government recreation-and-sport spending as a share of GDP and its trend — an investment-context control, available for EU/EEA nations only.
open-meteo (per club city)
The home-climate each player is acclimatised to, aggregated to a squad acclimatisation temperature and differenced against host-venue heat.
Transfermarkt injury pages
All 48 national-team injury pages were scraped, but returned 0 parseable entries three days before kickoff. Availability is already encoded in the finalised 26-man squads (injured players omitted).
inside.fifa.com
Deferred — the page loads its ranking history client-side, and Elo is the stronger, fully-reproducible state variable used as the primary prior.
Sizes, file counts and spans are measured directly from the data store at build time, so the catalog cannot drift from the data it describes.
Two sources are shown as attempted-but-deferred: the FIFA-ranking history, because Elo is the stronger, fully-reproducible rating prior; and the injury feed, which returned no usable rows three days before kick-off — availability is instead encoded in the finalised 26-man squads (injured players were simply omitted). Both are re-runnable mid-tournament.
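Recomputing Elo from the results spine comes down to one update per match. The sketch below follows the World Football Elo formula as it is commonly described (win expectancy on a 400-point scale, a goal-difference multiplier, a home bonus). The function names, the default `k=20` and the 100-point home advantage are assumptions for illustration; the project's own recompute script is not shown on this page and may differ.

```python
def expected_score(rating_diff: float) -> float:
    """Win expectancy for the side that is ahead by rating_diff points."""
    return 1.0 / (10.0 ** (-rating_diff / 400.0) + 1.0)


def goal_multiplier(goal_diff: int) -> float:
    """Margin-of-victory multiplier: 1, 1.5, 1.75, then +1/8 per extra goal."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.5
    return (11.0 + n) / 8.0


def elo_update(home: float, away: float, home_goals: int, away_goals: int,
               k: float = 20.0, home_adv: float = 100.0):
    """Return (new_home, new_away) after one match.

    k and home_adv are assumed defaults: the method varies k with the
    importance of the competition, and home_adv should be 0 at neutral venues.
    """
    we = expected_score(home + home_adv - away)
    if home_goals > away_goals:
        w = 1.0
    elif home_goals == away_goals:
        w = 0.5
    else:
        w = 0.0
    delta = k * goal_multiplier(home_goals - away_goals) * (w - we)
    return home + delta, away - delta


# Two equally rated sides on neutral ground, 1-0 to the "home" side.
new_home, new_away = elo_update(1500.0, 1500.0, 1, 0, k=20.0, home_adv=0.0)
print(new_home, new_away)
```

The update is zero-sum: whatever one side gains the other loses, which is what lets a full rating history be rebuilt by replaying matches in date order.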
§ 02
Public club-performance feeds cover the top-5 European leagues — and top-5 presence is itself a wealth proxy, so models built on them quietly learn “non-European ⇒ weak”. This project extends the squad layer to a genuinely global club panel (API-Football, 107 leagues), lifting coverage everywhere outside Europe and flipping the gradient-boosted model from worst to best-non-market.
Fig. V10 World choropleth · share of each squad with matched club data
Toggle between BEFORE (Understat top-5-Europe) and AFTER (global API-Football). Before, almost everything outside Western Europe is pale — not because those squads are weak, but because their players play in leagues the feed never saw. After, the map fills in. The map is interactive (drag, zoom); the same numbers are in the table below.
The confederation bars and the per-team table below carry the same finding without the map.
England and Scotland share one shape on the map (both are GBR in ISO-3166); it is tinted with England’s coverage. Cabo Verde and Curaçao are small but present. The table below lists every nation individually.
Overall coverage rises from 48% to 85% — and the lift is largest exactly where the old feed was blind: OFC 15%→92%, AFC 17%→71%. UEFA, already well-covered, barely moves.
Fig. V10b Before → after, weakest-covered confederation first
The gain is not a uniform shift — it lands precisely where top-5 European feeds see least. The confederations they under-cover (OFC, AFC, CONCACAF) gain the most; UEFA, already near-complete, gains least.
The full per-team coverage, biggest gainer first. South Africa, Qatar and Saudi Arabia were nearly invisible to a top-5-only feed (one or zero players); the global panel sees almost their entire squad. The two UEFA teams at the bottom dip a touch because the global match window is stricter than Understat’s season — coverage, not invention.
| Nation | In top-5 | Before | After | Uplift |
|---|---|---|---|---|
| Qatar | 1 | | | +96pt |
| South Africa | 1 | | | +92pt |
| Saudi Arabia | 1 | | | +88pt |
| New Zealand | 4 | | | +77pt |
| Korea Republic | 4 | | | +70pt |
| Egypt | 4 | | | +70pt |
| Panama | 2 | | | +69pt |
| Cabo Verde | 4 | | | +66pt |
| Mexico | 8 | | | +61pt |
| Czechia | 10 | | | +58pt |
| Canada | 6 | | | +52pt |
| Türkiye | 8 | | | +50pt |
| Uzbekistan | 0 | | | +50pt |
| Morocco | 12 | | | +50pt |
| Curaçao | 5 | | | +46pt |
| IR Iran | 4 | | | +43pt |
| Haiti | 9 | | | +42pt |
| Paraguay | 9 | | | +42pt |
| Jordan | 0 | | | +42pt |
| Bosnia and Herzegovina | 12 | | | +39pt |
| Australia | 11 | | | +39pt |
| Ecuador | 8 | | | +38pt |
| Algeria | 13 | | | +38pt |
| Uruguay | 13 | | | +35pt |
| Iraq | 3 | | | +34pt |
| Brazil | 16 | | | +34pt |
| USA | 17 | | | +31pt |
| Norway | 17 | | | +31pt |
| Tunisia | 9 | | | +30pt |
| Colombia | 15 | | | +27pt |
| Côte d'Ivoire | 18 | | | +27pt |
| Japan | 16 | | | +26pt |
| Scotland | 16 | | | +26pt |
| Sweden | 16 | | | +23pt |
| Senegal | 20 | | | +23pt |
| Ghana | 17 | | | +23pt |
| Congo DR | 18 | | | +19pt |
| Croatia | 21 | | | +15pt |
| Belgium | 20 | | | +15pt |
| Austria | 21 | | | +12pt |
| Spain | 23 | | | +12pt |
| Portugal | 21 | | | +11pt |
| England | 23 | | | +8pt |
| France | 22 | | | +7pt |
| Argentina | 23 | | | +4pt |
| Switzerland | 23 | | | +4pt |
| Germany | 25 | | | −0pt |
| Netherlands | 25 | | | −8pt |
Source · Understat (top-5 Europe) vs API-Football (global club coverage). Sort any column; the bars show the share of each squad with matched club data.
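The uplift column is a difference of two coverage shares, expressed in percentage points. A minimal sketch of that arithmetic follows, assuming one row per player with two boolean flags. `stats_matched` is a column of the published `players` table; `in_top5_feed`, the function names and the four toy players are hypothetical.

```python
from collections import defaultdict


def coverage_by_team(players, flag):
    """Share of each squad (0-1) whose players have a truthy `flag`."""
    matched = defaultdict(int)
    total = defaultdict(int)
    for p in players:
        total[p["team"]] += 1
        matched[p["team"]] += 1 if p[flag] else 0
    return {team: matched[team] / total[team] for team in total}


def uplift_points(before, after):
    """After minus before, in whole percentage points, per team."""
    return {team: round(100 * (after[team] - before[team])) for team in before}


# Toy squad of four players (illustrative values only).
toy_players = [
    {"team": "Team A", "in_top5_feed": True, "stats_matched": True},
    {"team": "Team A", "in_top5_feed": False, "stats_matched": True},
    {"team": "Team A", "in_top5_feed": False, "stats_matched": True},
    {"team": "Team A", "in_top5_feed": False, "stats_matched": False},
]

before = coverage_by_team(toy_players, "in_top5_feed")
after = coverage_by_team(toy_players, "stats_matched")
print(uplift_points(before, after))  # {'Team A': 50}
```

A negative uplift, as for the two UEFA sides at the bottom of the table, simply means fewer players were matched under the stricter global window than under the top-5 feed.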
§ 03
The 35 engineered features the models read, grouped into five families — history, squad quality, form & fitness, context, and the decoupling g. Each has a plain definition, its source, its observed range and a distribution sparkline. (The gradient-boosted model builds match-level differences x_i − x_j from this panel at train time.)
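The match-level differences x_i − x_j can be sketched as a plain per-column subtraction over two rows of the team panel. The column names below come from the published schema; the function name, the `d_` prefix, the toy values and the choice to propagate missing values as `None` are assumptions, not the project's training code.

```python
from typing import Dict, List, Optional

Panel = Dict[str, Dict[str, Optional[float]]]


def match_differences(panel: Panel, team_i: str, team_j: str,
                      columns: List[str]) -> Dict[str, Optional[float]]:
    """Return x_i - x_j per column; None when either side is n/a."""
    xi, xj = panel[team_i], panel[team_j]
    diffs: Dict[str, Optional[float]] = {}
    for col in columns:
        a, b = xi.get(col), xj.get(col)
        diffs[f"d_{col}"] = None if a is None or b is None else a - b
    return diffs


# Toy panel with made-up numbers; climate_gap is missing for one side.
toy_panel: Panel = {
    "Team A": {"elo_now": 1900, "nt_last15_ppg": 2.0, "climate_gap": None},
    "Team B": {"elo_now": 1750, "nt_last15_ppg": 1.5, "climate_gap": 3.0},
}
cols = ["elo_now", "nt_last15_ppg", "climate_gap"]

d_ab = match_differences(toy_panel, "Team A", "Team B", cols)
d_ba = match_differences(toy_panel, "Team B", "Team A", cols)
print(d_ab)
```

Differences are antisymmetric, so swapping the two teams flips every sign. That keeps a model's view of a fixture consistent regardless of which side is listed first.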
| Feature | Column | Definition | Source | n/a |
|---|---|---|---|---|
| Current Elo rating | elo_now | Current Elo rating (eloratings.net method, end-2026). | Elo ratings (eloratings.net method) | |
| Elo change over 5 years | elo_change_5y | Elo change over last 5 years. | Elo ratings (eloratings.net method) | |
| Elo change over 10 years | elo_change_10y | Elo change over last 10 years. | Elo ratings (eloratings.net method) | |
| Linear Elo slope per year (10y) | elo_slope_10y | Linear Elo slope per year over 10y. | Elo ratings (eloratings.net method) | |
| Points per game, last 15 NT matches | nt_last15_ppg | National-team points-per-game, last 15 matches. | International match results, 1872–2026 | |
| Goal difference per game, last 15 | nt_last15_gd_pg | National-team goal-difference per game, last 15. | International match results, 1872–2026 | |
| Win rate, last 15 NT matches | nt_last15_winrate | National-team win rate, last 15. | International match results, 1872–2026 | |
| NT recent-form trend | nt_form_trend | Recent NT form trend (slope). | International match results, 1872–2026 | |
| Mean squad age | mean_age | Mean squad age. | Official squad announcements | |
| Share of squad aged 25-29 (peak) | pct_peak_25_29 | Share of squad aged 25–29 (peak). | Official squad announcements | |
| # players over 30 | n_over30 | Players over 30. | Official squad announcements | |
| # players under 23 | n_under23 | Players under 23. | Official squad announcements | |
| Mean caps (experience) | mean_caps | Mean international caps. | Official squad announcements | |
| Largest single-club bloc | largest_club_bloc | Largest same-club bloc size. | Official squad announcements | |
| # distinct clubs in squad | n_distinct_clubs | Distinct clubs in squad. | Official squad announcements | |
| Total squad market value (EUR) | squad_value_eur | Total squad market value (Transfermarkt, EUR). | Transfermarkt | |
| Squad-value growth (8y CAGR) | value_cagr_8y | Squad value growth, 8y. | Transfermarkt | |
| Squad xG+xA per 90 (Understat top-5) | squad_form_xgxa_per90 | Squad club xG+xA per 90 (Understat top-5). | Understat | 4 |
| # squad players with Understat top-5 data | squad_form_coverage | Coverage of top-5-league form data (0–1). | Understat | |
| Squad fitness readiness (top-5) | squad_fitness_readiness | Top-5-league minutes-based readiness. | Oxford Football Forecasting model | 4 |
| Mean club minutes per game | mean_minutes_per_game | Mean club minutes per game. | API-Football (global club coverage) | 4 |
| Squad fitness (global) | squad_fitness_global | GLOBAL (de-biased) fitness readiness. | API-Football (global club coverage) | |
| Global fitness coverage (fraction) | fitness_coverage_global | Coverage of global fitness data (0–1). | API-Football (global club coverage) | |
| Squad club form (global, league-weighted) | squad_form_global | GLOBAL (de-biased) squad club form. | API-Football (global club coverage) | |
| Squad club form (global, unweighted) | squad_form_global_raw | Global form before league normalisation. | API-Football (global club coverage) | |
| Mean league-strength coefficient of squad clubs | mean_league_coef | Mean league-strength coefficient of squad's clubs. | Oxford Football Forecasting model | |
| Global club-form coverage (fraction) | form_coverage_global | Coverage of global form data (0–1). | API-Football (global club coverage) | |
| GDP compound annual growth, 5y | gdp_cagr_5y | GDP compound annual growth, 5y (World Bank). | World Bank | |
| GDP compound annual growth, 10y | gdp_cagr_10y | GDP compound annual growth, 10y. | World Bank | |
| Squad familiarity / chemistry index | familiarity_index | Shared-club chemistry index. | Oxford Football Forecasting model | |
| # shared-club player pairs | shared_club_pairs | Count of same-club player pairs. | Oxford Football Forecasting model | |
| Minute-weighted familiarity index | familiarity_weighted_index | Co-tenure-weighted chemistry. | Oxford Football Forecasting model | |
| Climate-adaptation gap to host venues | climate_gap | Squad acclimatisation vs host max-temp (°C). | Open-Meteo & venue records | 1 |
| z(NT form) - z(squad value): over/under-performance | form_vs_value_gap | Form residual vs market value. | Oxford Football Forecasting model | |
| z(Elo) - z(squad value): history vs market | elo_vs_value_gap | Elo residual vs market value. | Oxford Football Forecasting model | |
Source · Oxford Football Forecasting model — the engineered feature panel (48 teams × 35 numeric features). Sparkline = a 12-bin histogram across the 48 nations.
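The two gap features are named as differences of z-scores, z(a) − z(b), taken across the 48 nations. A minimal sketch of that construction follows. It assumes a population standard deviation and raw (unlogged) inputs; which form measure feeds z(NT form), and whether squad value is log-transformed first, is not stated here, so the inputs are placeholders.

```python
from statistics import mean, pstdev
from typing import List


def zscores(values: List[float]) -> List[float]:
    """Standardise to mean 0 and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sd for v in values]


def z_gap(a: List[float], b: List[float]) -> List[float]:
    """Per-team z(a) - z(b): positive means a is high relative to b."""
    return [za - zb for za, zb in zip(zscores(a), zscores(b))]


# Three toy teams: form rises left to right while squad value falls.
toy_form = [1.0, 2.0, 3.0]
toy_value = [3.0, 2.0, 1.0]
gap = z_gap(toy_form, toy_value)
print([round(g, 3) for g in gap])
```

Because both inputs are standardised over the same set of teams, the gaps sum to zero: one team's over-performance against its price is always balanced elsewhere in the field.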
A player’s club form means more in a stronger league. Each of the 60 club leagues gets a relative-strength coefficient, estimated from market-implied club quality and shrunk toward a confederation prior for thin leagues. England tops it; the coefficient is what re-weights every player’s minutes and goals before they roll up into the squad-form signal. The top 12 are shown — query the explorer below for all 60.
| League (country) | Clubs priced | Strength (z) | Form coef. | Example clubs |
|---|---|---|---|---|
| England | 236 | +2.21 | 1.8 | Fulham, Brentford, Manchester City |
| Spain | 111 | +2.13 | 1.8 | Rayo Vallecano, Real Betis, Sevilla |
| Germany | 141 | +1.84 | 1.8 | SV Darmstadt 98, Eintracht Frankfurt, Bayer Leverkusen |
| Italy | 82 | +1.70 | 1.8 | Empoli, Atalanta, Cittadella |
| France | 95 | +1.70 | 1.8 | Marseille, Lille, Nice |
| Portugal | 44 | +1.14 | 1.49 | Famalicao, Benfica, Sporting CP |
| Brazil | 25 | +1.03 | 1.433 | Palmeiras, Santos, Atletico Paranaense |
| Netherlands | 55 | +0.74 | 1.297 | Feyenoord, Twente, Ajax |
| Turkey | 44 | +0.49 | 1.186 | Beşiktaş, Sivasspor, Bursaspor |
| Croatia | 14 | +0.34 | 1.124 | Dinamo Zagreb, HNK Rijeka, HNK Hajduk Split |
| Russia | 21 | +0.27 | 1.098 | Lokomotiv, Rubin, Baltika |
| Austria | 10 | +0.26 | 1.096 | Grazer AK, Lask Linz, Red Bull Salzburg |
Source · Oxford Football Forecasting model — league-strength estimates for all 60 leagues. Strength (z) is standardised; the form coefficient re-weights club form in the squad roll-up.
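How Strength (z) maps to the form coefficient is not spelled out above. As a reading aid only, the twelve published rows happen to be reproduced to within rounding by a capped exponential, min(1.8, exp(0.35·z)). The functional form, the slope 0.35 and the cap are an inferred fit to this table, not the documented method, and the other 48 leagues are not checked here.

```python
import math

# (league, strength z, published form coefficient), copied from the table above.
PUBLISHED = [
    ("England", 2.21, 1.8),
    ("Spain", 2.13, 1.8),
    ("Germany", 1.84, 1.8),
    ("Italy", 1.70, 1.8),
    ("France", 1.70, 1.8),
    ("Portugal", 1.14, 1.49),
    ("Brazil", 1.03, 1.433),
    ("Netherlands", 0.74, 1.297),
    ("Turkey", 0.49, 1.186),
    ("Croatia", 0.34, 1.124),
    ("Russia", 0.27, 1.098),
    ("Austria", 0.26, 1.096),
]


def form_coef_guess(z: float, slope: float = 0.35, cap: float = 1.8) -> float:
    """Hypothetical mapping: exponential in z, capped at `cap`."""
    return min(cap, math.exp(slope * z))


for league, z, coef in PUBLISHED:
    print(f"{league:12s} z={z:+.2f} published={coef:.3f} guess={form_coef_guess(z):.3f}")

max_err = max(abs(form_coef_guess(z) - coef) for _, z, coef in PUBLISHED)
print("largest absolute gap:", round(max_err, 4))
```

Under this reading, a goal scored in one of the five capped leagues counts 1.8 times a goal in a league of average strength (z = 0, coefficient 1.0). The residual gaps are consistent with z being shown to only two decimals.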
The lineage, end to end: 20 raw sources are harmonised and engineered by 61 scripts into the processed panels, which feed the model ladder, which the simulator plays 1.1M times into the locked forecast this whole site reads.
Fig. V20 One direction, one path
There is exactly one path from data to page: every figure on this site is produced from the same locked forecast — which is why every number reconciles.
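Playing the tournament 1.1M times is what keeps the Monte-Carlo noise on each probability small. A rough sketch of the size of that noise follows, assuming independent simulations and the usual binomial standard error for a proportion. The published `se_*` columns may be computed differently (the fixture-free Power numbers, for instance, also average over re-draws), and the 10% probability is a made-up input.

```python
import math


def mc_standard_error(p: float, n_sims: int) -> float:
    """Binomial standard error of an estimated probability p from n_sims runs."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return math.sqrt(p * (1.0 - p) / n_sims)


N_SIMS = 1_100_000  # "1.1M" simulated tournaments, as stated above

se = mc_standard_error(0.10, N_SIMS)
print(f"p = 10.00% ± {100 * se:.3f} percentage points (one standard error)")
```

The error shrinks with the square root of the run count, so halving it would take four times as many simulated tournaments.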
§ 04
Query the full dataset live, entirely in your browser — no server involved, nothing uploaded. Run the worked examples, write your own SQL over all 12 tables, browse any table point-and-click, and export the result as CSV. Below it, the full schema is documented so the data layer is complete even with JavaScript off.
The console runs a full SQL engine inside your browser: it loads on your first query, and nothing you type or compute ever leaves your machine. It is read-only: the published tables cannot be altered, and a stray UPDATE simply returns an error. Result tables are capped at 1,000 rows in the page for speed; the CSV export carries the full set. If the engine cannot load, the schema below still documents every table, column and type.
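To give a feel for the kind of query the console accepts, here is a sketch that recreates a few columns of the `forecast` table locally and ranks teams by draw luck. SQLite (via Python's standard library) stands in for the in-browser engine, which is not named on this page and may use a slightly different dialect. The three rows are invented; the column names follow the schema documented below.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# "group" is a reserved word in SQL, so it is quoted wherever it appears.
con.execute(
    'CREATE TABLE forecast (team TEXT, "group" TEXT, '
    "reality_champ REAL, power_champ REAL)"
)
con.executemany(
    "INSERT INTO forecast VALUES (?, ?, ?, ?)",
    [
        ("Team A", "A", 0.12, 0.10),  # toy rows, not published numbers
        ("Team B", "B", 0.08, 0.11),
        ("Team C", "C", 0.01, 0.01),
    ],
)

QUERY = """
SELECT team,
       "group",
       ROUND(reality_champ - power_champ, 4) AS luck
FROM forecast
ORDER BY luck DESC
LIMIT 1000
"""

result = con.execute(QUERY).fetchall()
for row in result:
    print(row)
```

A positive `luck` marks a soft draw and a negative one a tough draw, matching the sign convention of the `draw_luck` column.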
Every queryable table, its row count, what it holds, and (where documented) each column with its type and meaning. This is the no-JavaScript-complete reference for the explorer above.
forecast · 48 rows · 46 cols · The locked champion / stage probabilities + group-stage odds (the single source of truth).

| Column | Type | Meaning |
|---|---|---|
rank | int64 | — |
code | str | — |
cca3 | str | — |
team | str | — |
p_R32 | float64 | P(reach Round of 32 = clear group stage). |
se_R32 | float64 | — |
p_R16 | float64 | P(reach Round of 16). |
se_R16 | float64 | — |
p_QF | float64 | P(reach quarter-final). |
se_QF | float64 | — |
p_SF | float64 | P(reach semi-final). |
se_SF | float64 | — |
p_Final | float64 | P(reach Final). |
se_Final | float64 | — |
reality_champ | float64 | P(champion), fixture-aware: the LOCKED forecast value (the locked p_Champion). |
reality_champ_se | float64 | Monte-Carlo standard error of champion prob. |
conformal_level | str | — |
conformal_coverage_0p90 | float64 | — |
conformal_mean_set_size | float64 | — |
group | str | — |
confederation | str | — |
power_rank | int64 | — |
power_champ | float64 | Fixture-FREE 'Power' champion prob (mean over constrained re-draws). |
power_champ_se | float64 | — |
power_se_mc | float64 | — |
draw_luck | float64 | actual_champ − power_champ; >0 = soft draw, <0 = tough draw. |
draw_luck_se | float64 | — |
bracket_half | int64 | — |
bracket_quadrant | int64 | — |
confed_color | str | — |
is_host | bool | — |
gs_p_pos1 | float64 | — |
gs_p_pos2 | float64 | — |
gs_p_pos3 | float64 | — |
gs_p_pos4 | float64 | — |
gs_p_win_group | float64 | — |
gs_p_top2 | float64 | — |
gs_p_best_third | float64 | — |
gs_p_advance | float64 | — |
gs_exp_points | float64 | — |
gs_mean_gd | float64 | — |
gs_mean_gf | float64 | — |
gs_se_win_group | float64 | — |
gs_se_top2 | float64 | — |
gs_se_best_third | float64 | — |
gs_se_advance | float64 | — |

rankings · 48 rows · 35 cols · Power vs Reality ranks, champion odds and draw-luck per team.

| Column | Type | Meaning |
|---|---|---|
power_rank | int64 | — |
team | str | — |
group | str | — |
confederation | str | — |
actual_champ | float64 | — |
actual_champ_se | float64 | — |
power_champ | float64 | — |
power_se | float64 | — |
power_se_mc | float64 | — |
draw_luck | float64 | — |
draw_luck_se | float64 | — |
bracket_half | int64 | — |
bracket_quadrant | int64 | — |
bracket_half_if_runnerup | int64 | — |
bracket_quadrant_if_runnerup | int64 | — |
power_p_R32 | float64 | — |
power_se_R32 | float64 | — |
actual_p_R32 | float64 | — |
power_p_R16 | float64 | — |
power_se_R16 | float64 | — |
actual_p_R16 | float64 | — |
power_p_QF | float64 | — |
power_se_QF | float64 | — |
actual_p_QF | float64 | — |
power_p_SF | float64 | — |
power_se_SF | float64 | — |
actual_p_SF | float64 | — |
power_p_Final | float64 | — |
power_se_Final | float64 | — |
actual_p_Final | float64 | — |
power_p_Champion | float64 | — |
power_se_Champion | float64 | — |
actual_p_Champion | float64 | — |
code | str | — |
reality_rank | int64 | — |

teams · 48 rows · 127 cols · The full 48-team panel: forecast + every engineered feature + context, one wide row per nation.

| Column | Type | Meaning |
|---|---|---|
rank | int64 | — |
code | str | — |
cca3 | str | — |
team | str | — |
p_R32 | float64 | — |
se_R32 | float64 | — |
p_R16 | float64 | — |
se_R16 | float64 | — |
p_QF | float64 | — |
se_QF | float64 | — |
p_SF | float64 | — |
se_SF | float64 | — |
p_Final | float64 | — |
se_Final | float64 | — |
reality_champ | float64 | — |
reality_champ_se | float64 | — |
conformal_level | str | — |
conformal_coverage_0p90 | float64 | — |
conformal_mean_set_size | float64 | — |
group | str | — |
confederation | str | — |
power_rank | int64 | — |
power_champ | float64 | — |
power_champ_se | float64 | — |
power_se_mc | float64 | — |
draw_luck | float64 | — |
draw_luck_se | float64 | — |
bracket_half | int64 | — |
bracket_quadrant | int64 | — |
confed_color | str | — |
is_host | bool | — |
gs_p_pos1 | float64 | — |
gs_p_pos2 | float64 | — |
gs_p_pos3 | float64 | — |
gs_p_pos4 | float64 | — |
gs_p_win_group | float64 | — |
gs_p_top2 | float64 | — |
gs_p_best_third | float64 | — |
gs_p_advance | float64 | — |
gs_exp_points | float64 | — |
gs_mean_gd | float64 | — |
gs_mean_gf | float64 | — |
gs_se_win_group | float64 | — |
gs_se_top2 | float64 | — |
gs_se_best_third | float64 | — |
gs_se_advance | float64 | — |
elo_now | int64 | — |
elo_change_5y | int64 | — |
elo_change_10y | int64 | — |
elo_slope_10y | float64 | — |
gdp_cagr_5y | float64 | — |
gdp_cagr_10y | float64 | — |
nt_last15_ppg | float64 | — |
nt_last15_gd_pg | float64 | — |
nt_last15_winrate | float64 | — |
nt_form_trend | float64 | — |
squad_form_xgxa_per90 | float64 | — |
squad_form_coverage | int64 | — |
mean_age | float64 | — |
pct_peak_25_29 | float64 | — |
n_over30 | int64 | — |
n_under23 | int64 | — |
mean_caps | float64 | — |
largest_club_bloc | int64 | — |
n_distinct_clubs | int64 | — |
squad_value_eur | float64 | — |
form_vs_value_gap | float64 | — |
elo_vs_value_gap | float64 | — |
value_cagr_8y | float64 | — |
familiarity_index | float64 | — |
shared_club_pairs | int64 | — |
familiarity_weighted_index | float64 | — |
climate_gap | float64 | — |
squad_fitness_readiness | float64 | — |
mean_minutes_per_game | float64 | — |
squad_fitness_global | float64 | — |
fitness_coverage_global | float64 | — |
squad_form_global | float64 | — |
squad_form_global_raw | float64 | — |
mean_league_coef | float64 | — |
form_coverage_global | float64 | — |
g_s_hat | float64 | — |
g_h_hat | float64 | — |
g_mean | float64 | — |
g_sd | float64 | — |
primary_language | str | — |
n_languages | int64 | — |
shares_lang_with_host | bool | — |
region | str | — |
subregion | str | — |
capital_lat | float64 | — |
capital_lon | float64 | — |
home_utc_offset | int64 | — |
gdp_usd | float64 | — |
gdp_per_capita | float64 | — |
population | int64 | — |
dist_nearest_venue_km | int64 | — |
mean_dist_played_km | int64 | — |
max_timezone_shift_h | int64 | — |
max_venue_altitude_m | int64 | — |
max_venue_heat_index_c | float64 | — |
coach | str | — |
coach_nationality | str | — |
appointed | float64 | — |
foreign_coach | bool | — |
n_players | int64 | — |
mean_value_eur | float64 | — |
value_top3_share | float64 | — |
n_in_top5_league | int64 | — |
share_in_top5_league | float64 | — |
n_abroad_in_top5 | int64 | — |
top5_npxg_plus_xa | float64 | — |
top5_minutes | int64 | — |
top5_league_breakdown | str | — |
tm_value_coverage | float64 | — |
n_clustered_players | int64 | — |
n_club_blocs | int64 | — |
top_blocs | str | — |
co_tenure_seasons | int64 | — |
pairs_with_shared_history | int64 | — |
diaspora_usa | float64 | — |
diaspora_can | float64 | — |
diaspora_mex | float64 | — |
diaspora_hosts_total | float64 | — |
diaspora_per_1000_pop | float64 | — |
squad_acclim_tmax | float64 | — |
host_tmax | float64 | — |

players · 1,245 rows · 22 cols · All 1,245 squad players with their global club season (minutes, goals, league strength).

| Column | Type | Meaning |
|---|---|---|
player_id | str | — |
number | int64 | — |
position | str | — |
player | str | — |
dob | str | — |
caps | int64 | — |
nt_goals | int64 | — |
club_squad | str | — |
team | str | — |
group | str | — |
club_apifootball | str | — |
league | str | — |
league_country | str | — |
club_minutes | float64 | — |
club_apps | float64 | — |
club_goals_season | float64 | — |
stats_matched | bool | — |
league_strength_z | float64 | — |
league_log_q | float64 | — |
league_form_coef | float64 | — |
code | str | — |
confederation | str | — |

coaches · 48 rows · 8 cols · Each nation’s coach, nationality and the foreign-coach flag.

| Column | Type | Meaning |
|---|---|---|
team | str | — |
coach | str | — |
coach_nationality | str | — |
appointed | float64 | — |
foreign_coach | bool | — |
code | str | — |
confederation | str | — |
group | str | — |

matches · 104 rows · 13 cols · All 104 fixtures with venue, round and (where played) score.

| Column | Type | Meaning |
|---|---|---|
match_no | int64 | — |
round | int64 | — |
date_utc | str | — |
venue | str | — |
home | str | — |
away | str | — |
group | str | — |
home_score | float64 | — |
away_score | float64 | — |
winner | float64 | — |
stage | str | — |
home_code | str | — |
away_code | str | — |

features · 48 rows · 39 cols · The 36-column 2026 feature panel the models read (team key + 35 numeric), followed by three label columns (code, confederation, group).

| Column | Type | Meaning |
|---|---|---|
team | str | — |
elo_now | int64 | Current Elo rating (eloratings.net method, end-2026). |
elo_change_5y | int64 | Elo change over last 5 years. |
elo_change_10y | int64 | Elo change over last 10 years. |
elo_slope_10y | float64 | Linear Elo slope per year over 10y. |
gdp_cagr_5y | float64 | GDP compound annual growth, 5y (World Bank). |
gdp_cagr_10y | float64 | GDP compound annual growth, 10y. |
nt_last15_ppg | float64 | National-team points-per-game, last 15 matches. |
nt_last15_gd_pg | float64 | National-team goal-difference per game, last 15. |
nt_last15_winrate | float64 | National-team win rate, last 15. |
nt_form_trend | float64 | Recent NT form trend (slope). |
squad_form_xgxa_per90 | float64 | Squad club xG+xA per 90 (Understat top-5). |
squad_form_coverage | int64 | Coverage of top-5-league form data (0–1). |
mean_age | float64 | Mean squad age. |
pct_peak_25_29 | float64 | Share of squad aged 25–29 (peak). |
n_over30 | int64 | Players over 30. |
n_under23 | int64 | Players under 23. |
mean_caps | float64 | Mean international caps. |
largest_club_bloc | int64 | Largest same-club bloc size. |
n_distinct_clubs | int64 | Distinct clubs in squad. |
squad_value_eur | float64 | Total squad market value (Transfermarkt, EUR). |
form_vs_value_gap | float64 | Form residual vs market value. |
elo_vs_value_gap | float64 | Elo residual vs market value. |
value_cagr_8y | float64 | Squad value growth, 8y. |
familiarity_index | float64 | Shared-club chemistry index. |
shared_club_pairs | int64 | Count of same-club player pairs. |
familiarity_weighted_index | float64 | Co-tenure-weighted chemistry. |
climate_gap | float64 | Squad acclimatisation vs host max-temp (°C). |
squad_fitness_readiness | float64 | Top-5-league minutes-based readiness. |
mean_minutes_per_game | float64 | Mean club minutes per game. |
squad_fitness_global | float64 | GLOBAL (de-biased) fitness readiness. |
fitness_coverage_global | float64 | Coverage of global fitness data (0–1). |
squad_form_global | float64 | GLOBAL (de-biased) squad club form. |
squad_form_global_raw | float64 | Global form before league normalisation. |
mean_league_coef | float64 | Mean league-strength coefficient of squad's clubs. |
form_coverage_global | float64 | Coverage of global form data (0–1). |
code | str | — |
confederation | str | — |
group | str | — |

features_historical · 128 rows · 17 cols · The same features rebuilt leakage-safe for 128 past team-tournaments.

| Column | Type | Meaning |
|---|---|---|
tournament | str | — |
team | str | — |
elo_asof | float64 | — |
elo_change_5y | float64 | — |
nt_last15_ppg | float64 | — |
nt_last15_gd_pg | float64 | — |
squad_form_xgxa_per90 | float64 | — |
squad_fitness_readiness | float64 | — |
familiarity_index | float64 | — |
familiarity_weighted_index | float64 | — |
understat_coverage | float64 | — |
squad_fitness_global | float64 | — |
fitness_coverage_global | float64 | — |
squad_form_global | float64 | — |
squad_form_global_raw | float64 | — |
mean_league_coef | float64 | — |
form_coverage_global | float64 | — |

model_results · 2,128 rows · 14 cols · Every out-of-sample match prediction (7 models × W/D/L), the backtest evidence.

| Column | Type | Meaning |
|---|---|---|
protocol | str | — |
fold | str | — |
model | str | — |
match_id | str | — |
home | str | — |
away | str | — |
pH | float64 | — |
pD | float64 | — |
pA | float64 | — |
actual | str | — |
understat_tier | str | — |
subgroup | str | — |
p_over | float64 | — |
total_actual | int64 | — |

collisions · 28 rows · 8 cols · The top-8 “earliest possible meeting round” pairs from the bracket draw.

| Column | Type | Meaning |
|---|---|---|
teamA | str | — |
teamB | str | — |
groupA | str | — |
groupB | str | — |
earliest_round | str | — |
detail | str | — |
codeA | str | — |
codeB | str | — |

league_strength · 60 rows · 9 cols · The 60-league relative-strength normalisation (England top → Qatar bottom).

| Column | Type | Meaning |
|---|---|---|
league_country | str | — |
n_priced | int64 | — |
raw_log_q | float64 | — |
shrunk_log_q | float64 | — |
strength_z | float64 | — |
prior_backed | bool | — |
form_coef | float64 | — |
example_clubs | str | — |
confederation_hint | str | — |

elo_current · 336 rows · 3 cols · Current world Elo for 336 teams, the reproducible rating spine.

| Column | Type | Meaning |
|---|---|---|
team | str | — |
elo | float64 | — |
rank | int64 | — |
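
The cell count quoted for the explorer can be cross-checked against this schema: multiplying each table's rows by its columns and summing over the 12 tables should land on the same figure. The row and column counts below are copied from the table captions above; only the dictionary name is an invention of this sketch.

```python
# (rows, cols) per queryable table, as listed in the schema captions.
TABLE_SHAPES = {
    "forecast": (48, 46),
    "rankings": (48, 35),
    "teams": (48, 127),
    "players": (1245, 22),
    "coaches": (48, 8),
    "matches": (104, 13),
    "features": (48, 39),
    "features_historical": (128, 17),
    "model_results": (2128, 14),
    "collisions": (28, 8),
    "league_strength": (60, 9),
    "elo_current": (336, 3),
}

total_cells = sum(rows * cols for rows, cols in TABLE_SHAPES.values())
print(len(TABLE_SHAPES), "tables,", total_cells, "cells")  # 12 tables, 74722 cells
```

The sum matches the 74,722 quoted at the top of the page, so the headline figure and the schema describe the same published dataset.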