Durham University
Programme and Module Handbook

Postgraduate Programme and Module Handbook 2020-2021 (archived)

Module COMP42315: Programming for Data Science

Department: Computer Science

Type: Tied
Level: 4
Credits: 15
Availability: Available in 2020/21
Module Cap: None
Tied to G5K823
Tied to G5K923


Prerequisites

  • None


Corequisites

  • None

Excluded Combination of Modules

  • None


Aims

  • To provide knowledge of, and the ability to apply, popular Python software packages currently used in industry settings.
  • To give students an understanding of how to programmatically gather, manipulate and process real-world data.
  • To introduce students to the key concepts of data analysis and data visualisation.


Content

  • Programming in Python using popular packages such as Pandas, NumPy, SciPy and Matplotlib.
  • Reading, writing and parsing files in different formats.
  • Obtaining a data set through the use of web scraping.
  • Data munging – cleaning and preparing a dataset for analysis and visualisation.

Learning Outcomes

Subject-specific Knowledge:
  • By the end of this module, students should:
  • Understand advanced concepts of programming in Python.
  • Have a critical appreciation of the main strengths and weaknesses of a range of Python packages and understand how to use them.
  • Have a critical appreciation of how to acquire and clean datasets for analysis.
  • Understand how to manipulate potentially large datasets in an efficient manner.
Subject-specific Skills:
  • By the end of this module, students should:
  • Be able to write computer programs in Python using industry-standard packages.
  • Be able to select appropriate data structures for modelling various data science scenarios.
  • Be able to select the appropriate algorithm and programming package for a given problem.
  • Be able to write a computer program in Python to collect or read data from available sources, and clean these datasets using the appropriate packages.
Key Skills:
  • Effective written communication
  • Planning, organising and time-management
  • Problem solving and analysis

Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

  • This module will be delivered by the Department of Computer Science.
  • Learning outcomes are met through classroom-based workshops, supported by online resources. The workshops combine taught input, case studies, discussion and computing labs. Online resources will typically consist of directed reading and a programming environment with example code.
  • The summative assessment is an individual written report on the design, implementation and analysis of a program designed to solve a specific data science problem.

Teaching Methods and Learning Hours

Activity    Number   Frequency                               Duration   Total/Hours
Lectures    8        2 times per week (Term 2, weeks 1-4)    1 hour     8
Workshops   8        2 times per week (Term 2, weeks 1-4)    2 hours    16
Surgery     12       3 times per week (Term 2, weeks 1-4)    1 hour     12

Summative Assessment

Component: Assignment
Component Weighting: 100%

Element                                                           Length / duration   Element Weighting   Resit Opportunity
Individual written assignment based on development of a program   2000 words          100%

Formative Assessment:

The formative assessment consists of classroom-based exercises on specific computer science topics relevant to the learning outcomes of the module. Oral feedback will be given on a group and/or individual basis as appropriate.

Attendance at all activities marked with this symbol will be monitored. Students who fail to attend these activities, or to complete the summative or formative assessment specified above, will be subject to the procedures defined in the University's General Regulation V, and may be required to leave the University.