Durham University
Postgraduate Programme and Module Handbook 2025-2026

Module COMP54315: Reinforcement Learning

Department: Computer Science

Type: Tied
Level: 5
Credits: 15
Availability: Available in 2025/2026
Module Cap:
Tied to: G5T609, G5T709

Prerequisites

  • None

Corequisites

  • None

Excluded Combination of Modules

  • None

Aims

  • To give students an in-depth understanding of the theoretical and algorithmic foundations of reinforcement learning and of its applications.
  • To enable students to implement and experiment with a range of deep reinforcement learning algorithms, and to visualise and evaluate their performance.

Content

  • Introduction and foundations of reinforcement learning.
  • Bandits and (fully/partially observable) Markov decision processes.
  • Computing values via dynamic programming.
  • Monte Carlo methods for value approximation and planning.
  • On-policy, off-policy, and temporal difference learning.
  • Model-based and model-free prediction and control.
  • Function approximation and policy gradient methods.
  • Scaling up with deep reinforcement learning.
  • Integrating learning and planning; exploration/exploitation trade-off.

Learning Outcomes

Subject-specific Knowledge:
By the end of this module, students should be able to demonstrate:
  • a critical understanding of the key features of reinforcement learning and how it differs from non-interactive learning.
  • a critical understanding of state-of-the-art reinforcement learning algorithms.
  • an understanding of the issues faced in scaling reinforcement learning approaches using deep learning.
Subject-specific Skills:
By the end of this module, students should be able to demonstrate:
  • an ability to use modern libraries to design, train, validate and test deep reinforcement learning models.
  • an ability to identify RL-based solutions appropriate to a given task or environment.
  • an ability to design bespoke RL algorithms suited to the problem and the environment, for example according to whether the action space is continuous or discrete.
  • an ability to solve complex learning and planning problems in dynamic environments.
Key Skills:
By the end of this module, students should be able to demonstrate:
  • a scientific approach to the design, training, validation, and testing of reinforcement learning techniques in a broad range of applications.
  • an ability to design new environments and tailored agents that learn to control them.
  • an ability to identify the problem area and then design and implement state-of-the-art reinforcement learning approaches.

Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

  • Lectures enable students to learn new material relevant to reinforcement learning and its applications.
  • Computer classes enable students to acquire the necessary coding skills, deepen their understanding of the material from the lectures, learn about the relevant libraries and packages, and receive feedback on their work.
  • Summative assessments test students' knowledge of the relevant libraries and their application of methods and techniques.
  • The exercise element of the coursework component consists of an assessment of the setup for the assignment.
  • The assignment element of the coursework component consists of a coding exercise with an accompanying report.

Teaching Methods and Learning Hours

Activity | Number | Frequency | Duration | Total (hours)
Lectures | 10 | 1 per week (2 in weeks 1 and 2) | 1 hour | 10
Computer Classes | 8 | 1 per week | 2 hours | 16
Independent Study | | | | 124
Total | | | | 150

Summative Assessment

Component: Coursework
Component Weighting: 100%
Element | Length / Duration | Element Weighting | Resit Opportunity
Exercise | | 10% |
Assignment | | 90% |

Formative Assessment:

Via computer classes.


Attendance at all activities marked with this symbol will be monitored. Students who fail to attend these activities, or to complete the summative or formative assessment specified above, will be subject to the procedures defined in the University's General Regulation V, and may be required to leave the University.