Oxford Certificate Programmes · Worcester College
AI and Deep Learning
Deep learning powers modern artificial intelligence, from image recognition to large language models. The course builds directly on the machine-learning workflow: framing a problem, fitting flexible models, evaluating them honestly, and deploying them responsibly.
Course overview
Students will see how neural networks compose simple, differentiable units into powerful models trained by gradient descent and backpropagation; how convolutional networks learn from images; how the attention mechanism and the transformer architecture lie behind today's large language models; and how retrieval, agents, and AI-assisted coding fit into real systems.
Throughout, the course keeps the discipline of classical machine learning: honest validation, awareness of failure modes and hallucination, and the safety, evaluation, and governance questions that responsible deployment demands. Each idea is built intuition-first and made concrete with a small worked example before any formula.
Learning outcomes
- explain how a neural network composes simple units into a flexible, differentiable model, and how it is trained by gradient descent and backpropagation;
- diagnose and control training (learning rate, regularisation, and overfitting) using held-out validation;
- explain how convolutional networks exploit spatial structure in images, and apply transfer learning and data augmentation without leakage;
- describe the attention mechanism and the transformer architecture behind modern large language models;
- distinguish prompting, retrieval-augmented generation, and fine-tuning, and choose appropriately for a given task;
- evaluate generative systems for factuality, grounding, and failure modes, and recognise when a simpler model is the better choice;
- assess the safety, monitoring, and governance of deployed and agentic AI systems, and use AI coding tools responsibly, with evidence and reproducibility.
Teaching & assessment
Teaching method. Students are taught according to the Oxford Socratic model, where class participation is central. Teaching combines lectures, guided discussion, hands-on Python labs, and group work in and outside class.
Prerequisites. The course assumes comfort with basic vectors and matrices, derivatives and the chain rule, and elementary probability; some familiarity with Python helps with the labs. No previous deep-learning experience is required: the course builds on the AI and Machine Learning week, but core ideas are reviewed so that motivated newcomers can follow.
Assessment. Assessment takes place on Friday at the end of the course.
Weekly schedule
| Day | Topic | Focus |
|---|---|---|
| Monday | Neural networks | From logistic regression to multilayer perceptrons; activations and depth; backpropagation and gradient descent; training, regularisation, and honest validation. |
| Tuesday | Computer vision and CNNs | Images as tensors; convolution and pooling; convolutional architectures; transfer learning and data augmentation without leakage. |
| Wednesday | Transformers, LLMs, and retrieval | The attention mechanism and the transformer; pretraining and fine-tuning; prompting; and retrieval-augmented generation (RAG). |
| Thursday | Agentic AI, deployment, and governance | Tool-using agents and AI-assisted coding; evaluating generative systems; and the safety, monitoring, and governance of deployed AI. |
| Friday | Assessment | End-of-course assessment. |
Session overview
Neural Networks
This session builds the neural network from a single neuron (logistic regression, viewed as one unit with a sigmoid activation) up to a multilayer perceptron. We cover what depth buys, how backpropagation computes gradients, and how gradient descent trains the model, then return to the familiar discipline of train/validation/test splits and regularisation to keep flexible models honest.
Key idea: a neural network is a differentiable function fit by gradient descent, and we trust it only if it generalises.
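To make the key idea concrete, the following is a minimal sketch of backpropagation and gradient descent in plain Python, not the course's lab code. The network size (two inputs, three tanh hidden units, one sigmoid output), the XOR toy data, the learning rate, and all function names are illustrative assumptions. It trains on four points only, so it shows the mechanics of training and says nothing about generalisation.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Forward pass: tanh hidden layer, then a single sigmoid output unit."""
    W1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return h, p

def loss(params, data):
    """Mean binary cross-entropy over the data set."""
    total = 0.0
    for x, y in data:
        _, p = forward(params, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def gradients(params, data):
    """Backpropagation: apply the chain rule from the loss back to each weight."""
    W1, b1, w2, b2 = params
    gW1 = [[0.0] * len(row) for row in W1]
    gb1 = [0.0] * len(b1)
    gw2 = [0.0] * len(w2)
    gb2 = 0.0
    n = len(data)
    for x, y in data:
        h, p = forward(params, x)
        dz2 = (p - y) / n  # gradient at the output pre-activation
        gb2 += dz2
        for j, hj in enumerate(h):
            gw2[j] += dz2 * hj
            dz1 = dz2 * w2[j] * (1 - hj * hj)  # chain rule through tanh
            gb1[j] += dz1
            for i, xi in enumerate(x):
                gW1[j][i] += dz1 * xi
    return gW1, gb1, gw2, gb2

def step(params, grads, lr):
    """One gradient-descent update: move each parameter against its gradient."""
    W1, b1, w2, b2 = params
    gW1, gb1, gw2, gb2 = grads
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    w2 = [w - lr * g for w, g in zip(w2, gw2)]
    return W1, b1, w2, b2 - lr * gb2

rng = random.Random(0)  # fixed seed so the run is repeatable
H = 3                   # number of hidden units (an arbitrary choice)
params = (
    [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(H)],
    [0.0] * H,
    [rng.uniform(-1, 1) for _ in range(H)],
    0.0,
)
# XOR: a toy problem that a single neuron cannot fit but a hidden layer can.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 0)]

initial_loss = loss(params, data)
for _ in range(500):
    params = step(params, gradients(params, data), lr=0.2)
final_loss = loss(params, data)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

A useful habit when writing backpropagation by hand is to compare one analytic gradient entry against a finite-difference estimate of the same quantity; the two should agree to several decimal places.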
Computer Vision and CNNs
Images become tensors, and we introduce the convolution and pooling operations that let a network exploit spatial structure. We discuss convolutional architectures, why transfer learning works, and how to augment data without letting validation or test images leak into training.
Key idea: a CNN sees an image by sliding the same small pattern-detector everywhere, reusing weights instead of relearning them.
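The sliding pattern-detector can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a production implementation: the 5x5 image, the hand-written 2x2 edge kernel, and the function names `conv2d` and `max_pool` are all made up for the example, and in a real CNN the kernel values are learned rather than chosen by hand.

```python
def conv2d(image, kernel):
    """'Valid' cross-correlation, the operation deep-learning libraries call convolution.

    The same kernel weights are reused at every position of the image.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response in each block."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A tiny greyscale "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hand-written detector for a dark-to-bright vertical edge.
edge_kernel = [[-1, 1],
               [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 map, non-zero only at the edge
pooled = max_pool(features)            # 2x2 summary of the strongest responses
print(features)
print(pooled)
```

The feature map is non-zero only in the column where the image changes from dark to bright, and the four kernel weights are the only parameters no matter how large the image is.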
Transformers, LLMs, and Retrieval
This session introduces the attention mechanism and the transformer architecture behind modern large language models. We compare prompting, retrieval-augmented generation, and fine-tuning, and discuss why grounding through retrieval matters when factuality is at stake.
Key idea: attention lets every word look at every other word, and a language model is next-word prediction at scale: fluent, but not necessarily right.
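The "every word looks at every other word" step is scaled dot-product attention, sketched below in plain Python. The three two-dimensional token vectors are made up for illustration, and the sketch makes one simplifying assumption: it uses the token vectors directly as queries, keys, and values, whereas a transformer first applies learned projection matrices to produce them.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the weighted average of the value vectors.
        outputs.append([sum(wi * v[j] for wi, v in zip(w, values))
                        for j in range(len(values[0]))])
    return outputs, weights

# Three toy "token" vectors (illustrative values, not real embeddings).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors, with more weight on tokens whose keys are more similar to the query.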
Agentic AI, Deployment, and Governance
The final session covers tool-using agents and AI-assisted coding, how to evaluate generative systems for grounding and failure, and the safety, monitoring, and governance questions, from drift to accountability, that responsible deployment of AI systems demands.
Key idea: an AI system is a model plus tools, memory, and guardrails: the more it can do, the more it must be evaluated and governed.
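The "model plus tools, memory, and guardrails" picture can be sketched as a small loop. Everything here is a hypothetical stand-in chosen for illustration: `stub_model` is a hard-coded function, not a language model, `calculator` is a toy tool, and the allow-list and step budget are two simple examples of guardrails rather than a complete safety design.

```python
def calculator(expression):
    """Hypothetical tool: adds whitespace-separated numbers."""
    return str(sum(float(token) for token in expression.split()))

TOOLS = {"calculator": calculator}  # allow-list: the only actions the agent may take
MAX_STEPS = 3                       # step budget: the agent cannot loop forever

def stub_model(task, memory):
    """Stand-in for a language model: returns an (action, argument) pair."""
    if not memory:
        return "calculator", task
    return "finish", memory[-1][2]  # report the last tool result

def run_agent(task, model=stub_model):
    """Run the model in a loop, executing only allowed tools and logging every step."""
    memory = []  # (action, argument, result) records, doubling as an audit trail
    for _ in range(MAX_STEPS):
        action, argument = model(task, memory)
        if action == "finish":
            return argument, memory
        if action not in TOOLS:
            # Guardrail: refuse and record anything outside the allow-list.
            memory.append((action, argument, "refused: tool not allowed"))
            continue
        memory.append((action, argument, TOOLS[action](argument)))
    return None, memory  # budget exhausted without an answer

answer, trail = run_agent("2 3 4")
print(answer, trail)

# A misbehaving "model" that keeps requesting a tool that is not on the allow-list.
blocked_answer, blocked_trail = run_agent(
    "tidy up", model=lambda task, memory: ("delete_files", "/"))
print(blocked_answer, blocked_trail)
```

The recorded trail is what makes the system evaluable after the fact: every requested action, including each refused one, is logged alongside its result.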
Core bibliography & reading list
All items below are freely and publicly available online.
- Prince, Simon J. D. Understanding Deep Learning. MIT Press, 2023. udlbook.github.io
- Zhang, Aston, Zachary C. Lipton, Mu Li, and Alexander J. Smola. Dive into Deep Learning. Cambridge University Press, 2023. d2l.ai
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. deeplearningbook.org
- Deisenroth, Marc Peter, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine Learning. Cambridge University Press, 2020. mml-book.github.io
- Sanderson, Grant (3Blue1Brown). Neural Networks (visual video series). 3blue1brown.com
- Stanford CS231n. Deep Learning for Computer Vision (course notes). cs231n.github.io
- Jurafsky, Dan, and James H. Martin. Speech and Language Processing. 3rd ed. (draft), covering transformers and large language models. web.stanford.edu/~jurafsky/slp3
- Alammar, Jay. The Illustrated Transformer. jalammar.github.io
- Google. Machine Learning Crash Course (neural networks, embeddings, and LLM modules). developers.google.com
- National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF 1.0). 2023. nist.gov