Expanding-window validation trains only on tournaments strictly before
the one being scored, mirroring how the model is actually used pre-tournament;
its score is the number we quote as expected skill. Leave-one-tournament-out
(LOTO) lets the model see future tournaments when scoring a past one — it is
optimistic and we report it only as an upper bound on skill. The ensemble's
LOTO RPS (0.1830) is lower, and therefore better, than its expanding-window
RPS (0.1891), a gap of 0.0061 in the direction you would expect; we never
headline the LOTO figure.
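The difference between the two schemes comes down to which tournaments are allowed into the training set. A minimal sketch, assuming tournaments are identified by labels in chronological order; the function names and the labels "T1", "T2", "T3" are hypothetical, and skipping the first tournament under the expanding window (it has no earlier data to train on) is an assumption rather than something stated above:

```python
from typing import Iterator, List, Sequence, Tuple


def expanding_window_splits(
    tournaments: Sequence[str],
) -> Iterator[Tuple[List[str], str]]:
    """Yield (train, test) pairs where train holds only strictly earlier tournaments."""
    # Assumption: the earliest tournament is never scored, because nothing precedes it.
    for i in range(1, len(tournaments)):
        yield list(tournaments[:i]), tournaments[i]


def loto_splits(
    tournaments: Sequence[str],
) -> Iterator[Tuple[List[str], str]]:
    """Yield (train, test) pairs where train holds every other tournament, past or future."""
    for i, held_out in enumerate(tournaments):
        yield [t for j, t in enumerate(tournaments) if j != i], held_out


# Hypothetical labels, oldest first.
tournaments = ["T1", "T2", "T3"]
expanding = list(expanding_window_splits(tournaments))
loto = list(loto_splits(tournaments))
```

Under LOTO, scoring "T1" trains on "T2" and "T3", which is the look-ahead that makes the figure optimistic; the expanding window never produces such a split.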
The intervals are deliberately conservative: for each model we take the
wider of a tournament-block bootstrap (resampling whole tournaments,
B = 3000) and a leave-one-tournament-out jackknife. Blocking by tournament keeps
matches from the same event together, so the CI reflects the true unit of
replication — the tournament, of which there are three.
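The "wider of two intervals" rule can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used here: the statistic is taken to be the mean per-match RPS, the bootstrap interval is a percentile interval, and the jackknife interval uses a normal approximation with z = 1.96 (with only three blocks a t-quantile may be more appropriate). All function names are hypothetical.

```python
import numpy as np


def block_bootstrap_ci(scores_by_tournament, B=3000, alpha=0.05, seed=0):
    """Percentile CI for the mean score, resampling whole tournaments with replacement."""
    rng = np.random.default_rng(seed)
    blocks = [np.asarray(s, dtype=float) for s in scores_by_tournament]
    k = len(blocks)
    means = np.empty(B)
    for b in range(B):
        # Draw k tournaments with replacement; all matches of a drawn tournament stay together.
        idx = rng.integers(0, k, size=k)
        means[b] = np.concatenate([blocks[i] for i in idx]).mean()
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def jackknife_ci(scores_by_tournament, z=1.96):
    """Leave-one-tournament-out jackknife CI (assumption: normal approximation)."""
    blocks = [np.asarray(s, dtype=float) for s in scores_by_tournament]
    k = len(blocks)
    full = np.concatenate(blocks).mean()
    # Recompute the mean with each tournament removed in turn.
    loo = np.array(
        [np.concatenate(blocks[:i] + blocks[i + 1:]).mean() for i in range(k)]
    )
    se = np.sqrt((k - 1) / k * np.sum((loo - loo.mean()) ** 2))
    return float(full - z * se), float(full + z * se)


def wider_ci(scores_by_tournament, B=3000, alpha=0.05, seed=0, z=1.96):
    """Return whichever of the two intervals is wider."""
    boot = block_bootstrap_ci(scores_by_tournament, B=B, alpha=alpha, seed=seed)
    jack = jackknife_ci(scores_by_tournament, z=z)
    return max((boot, jack), key=lambda ci: ci[1] - ci[0])
```

Here `scores_by_tournament` is a list with one sequence of per-match scores per tournament. With three tournaments the bootstrap can only ever draw from three blocks, which is one reason to pair it with the jackknife and keep the wider result.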