Oxford Certificate Programmes · Worcester College

AI and Machine Learning

Machine learning lets computers find patterns in data and turn them into predictions and decisions. The course follows the full workflow: framing a decision, preparing data, fitting models, evaluating them honestly, and using them responsibly.

Instructor: Dr Fatih Kansoy

Sessions: Summer I, II & III

Dates: 19 Jul – 29 Aug 2026

Course week: Week One

Location: Worcester College, Oxford

Format: Lectures, seminars & Python labs

Length: Two-week programme

Assessment: Friday assessment

Course overview

Students will see how supervised models (linear and logistic regression, decision trees, random forests, and boosting) are built and judged; how unsupervised methods such as clustering and principal component analysis reveal structure without labels; and how modern AI systems (embeddings, retrieval, and generative tools) sit alongside classical methods.

Throughout, the course emphasises honest evaluation, the danger of data leakage, the difference between prediction and causation, and the fairness, interpretability, and governance questions that responsible deployment requires. Through worked examples, Python labs, and the discussion of real decisions, students develop the analytical tools to choose appropriate models, evaluate them with the right metrics, audit them for leakage and bias, and communicate both their conclusions and their limitations clearly.

Learning outcomes

By the end of the course, students should be able to:
frame a real decision as a supervised learning problem, and identify which variables are genuinely available before the decision is made;
explain and apply the core supervised models (linear and logistic regression, decision trees, random forests, and boosting) and the bias–variance trade-off that governs them;
evaluate models with metrics matched to the decision (accuracy, precision/recall, ROC–AUC, calibration) using train/validation/test splits and cross-validation;
diagnose data leakage, and distinguish predictive association from causal effect;
use unsupervised methods (K-means clustering and principal component analysis) to find and interpret structure in unlabelled data;
explain how embeddings, retrieval-augmented generation, and other modern AI workflows relate to classical machine learning;
assess the fairness, interpretability, monitoring, and governance of a deployed model, and use AI tools responsibly, with evidence and reproducibility.

Teaching & assessment

Teaching method. Students are taught according to the Oxford Socratic model, where class participation is central. Teaching combines lectures, guided discussion, hands-on Python labs, and group work in and outside class. No prior programming experience is assumed, though it is welcome.

Assessment. Assessment takes place on Friday at the end of the course.

Weekly schedule

Day	Topic	Focus
Monday	Foundations: data, prediction, and trust	What machine learning is and is not; features and targets; loss; the train/test split; metrics; leakage; and why prediction is not causation.
Tuesday	Regression and classification: from models to decisions	Linear and logistic regression; coefficients and uncertainty; turning predicted probabilities into actions with cost-based thresholds.
Wednesday	Flexible models and honest evaluation	Decision trees, random forests, and boosting; cross-validation; precision, recall, ROC and PR curves; and auditing a model for data leakage.
Thursday	Unsupervised learning, modern AI, and responsible use	K-means clustering and PCA; embeddings and retrieval; and fairness, interpretability, monitoring, and governance.
Friday	Assessment	End-of-course assessment.

Session overview

Session 1

Foundations of Machine Learning

This session sets up the workflow we reuse all week: turning a decision into a prediction problem, separating signal from noise, splitting data honestly, and choosing a loss. We stress what is known before a decision is made, and why data leakage and the prediction–causation gap matter from the very start.
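As a small illustration of the honest split described above, here is a minimal sketch that is not taken from the course labs. It assumes a toy list of made-up measurements, and the helper names (`split_train_test`, `fit_scaler`, `apply_scaler`) are hypothetical, written with the Python standard library only. The point is that anything learned from data, even a mean and a standard deviation, is learned from the training rows alone and then reused on the test rows.

```python
import random
import statistics


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Learn the mean and standard deviation from the given values."""
    return statistics.mean(values), statistics.pstdev(values)


def apply_scaler(values, mean, std):
    """Standardise values using statistics that were learned elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical toy feature: 20 made-up measurements.
data = [float(x) for x in range(20)]

train, test = split_train_test(data)
mean, std = fit_scaler(train)                 # statistics come from train only
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)   # test reuses the train statistics
```

Fitting the scaler on all 20 values before splitting would let information from the test rows influence the training features, which is a mild but real form of leakage.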

Session 2

Regression and Classification

We turn to the baseline supervised models: linear regression for numbers and logistic regression for probabilities. We discuss coefficients, uncertainty, and how a predicted probability becomes an action through a cost-based threshold rather than a default cut-off.
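One way to make the cost-based threshold concrete is the sketch below. It is an illustration under stated assumptions rather than the course's own method: correct decisions are assumed to cost nothing, the predicted probabilities are assumed to be well calibrated, and the cost figures and function names are invented for the example. Under those assumptions, acting is worthwhile when p × (cost of a false negative) exceeds (1 − p) × (cost of a false positive), which rearranges to p > c_FP / (c_FP + c_FN).

```python
def cost_based_threshold(cost_false_positive, cost_false_negative):
    """Probability above which acting has lower expected cost than not acting.

    Assumes correct decisions cost nothing and probabilities are calibrated.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(probabilities, threshold):
    """Return True (act) for each probability at or above the threshold."""
    return [p >= threshold for p in probabilities]


# Hypothetical costs: missing a positive case is four times as costly
# as acting on a negative one.
threshold = cost_based_threshold(cost_false_positive=1.0, cost_false_negative=4.0)

# Made-up predicted probabilities from some fitted classifier.
probabilities = [0.05, 0.15, 0.25, 0.60]
actions = decide(probabilities, threshold)
```

With these made-up costs the threshold falls well below the default cut-off of 0.5, so a case with a predicted probability of 0.25 is acted on even though it is more likely negative than positive.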

Session 3

Flexible Models and Honest Evaluation

This session introduces decision trees, random forests, and boosting, together with the bias–variance trade-off that controls overfitting. The added flexibility is paired with stricter evaluation: cross-validation, the right metric for imbalanced problems, and a disciplined leakage audit.
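The sketch below illustrates two of the evaluation ideas named above, using only the Python standard library. It is a simplified illustration with made-up labels and hypothetical helper names, not the course's lab code: the fold generator does not shuffle or stratify, and precision is reported as 0.0 by convention when no positives are predicted.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once.

    Folds are contiguous blocks; no shuffling or stratification is done.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0   # convention when undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up imbalanced labels: 5 positives out of 100.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100        # a "model" that never predicts the positive class

acc = accuracy(y_true, always_negative)
prec, rec = precision_recall(y_true, always_negative)

folds = list(k_fold_indices(n_samples=10, k=3))
```

The always-negative predictor scores 95% accuracy while finding none of the positive cases, which is why accuracy alone is a poor guide on imbalanced problems.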

Session 4

Unsupervised Learning and Responsible AI

The final session moves from prediction to structure discovery with K-means clustering and PCA, connects classical tools to modern AI through embeddings and retrieval, and closes with the fairness, interpretability, monitoring, and governance questions that responsible deployment demands.
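For readers who want a feel for structure discovery before the session, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. The points, the starting centroids, and the function name `k_means` are all invented for illustration. In practice a library implementation with better initialisation would normally be used.

```python
def k_means(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))

        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(p[0] for p in cluster) / len(cluster)
                mean_y = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)   # keep an empty cluster's centroid

        if new_centroids == centroids:      # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters


# Made-up data: two visibly separate groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
start = [(0.0, 0.0), (10.0, 10.0)]          # hypothetical starting centroids

centroids, clusters = k_means(points, start)
```

No labels are used anywhere: the two groups emerge from distances alone, and whether those groups mean anything is a question of interpretation that the algorithm cannot answer.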

Core bibliography & reading list

All items below are freely and publicly available online.

James, Gareth, Daniela Witten, Trevor Hastie, Robert Tibshirani, and Jonathan Taylor. An Introduction to Statistical Learning with Applications in Python. Springer, 2023. statlearning.com
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. 2nd ed. Springer, 2009. hastie.su.domains
Deisenroth, Marc Peter, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine Learning. Cambridge University Press, 2020. mml-book.github.io
scikit-learn developers. scikit-learn User Guide. scikit-learn.org
VanderPlas, Jake. Python Data Science Handbook. 2nd ed. O'Reilly, 2022. jakevdp.github.io
Google. Machine Learning Crash Course. developers.google.com
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. deeplearningbook.org
Sanderson, Grant (3Blue1Brown). Neural Networks (visual video series). 3blue1brown.com
Molnar, Christoph. Interpretable Machine Learning. 2nd ed., 2022. christophm.github.io
National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF 1.0). 2023. nist.gov
