Postgraduate Programme and Module Handbook 2020-2021 (archived)
Module COMP42315: Programming for Data Science
Department: Computer Science
Type | Tied
Level | 4
Credits | 15
Availability | Available in 2020/21
Tied to | G5K823
Tied to | G5K923
Prerequisites
Corequisites
Excluded Combination of Modules
Aims
- To provide knowledge of, and the ability to apply, popular Python software packages currently used in industry settings.
- To give students an understanding of how to programmatically gather, manipulate and process real-world data.
- To introduce students to the key concepts of data analysis and data visualisation.
Content
- Programming in Python using popular packages such as Pandas, NumPy, SciPy and Matplotlib.
- Reading, writing and parsing files in different formats.
- Obtaining a data set through the use of web scraping.
- Data munging – cleaning and preparing a dataset for analysis and visualisation.
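As an illustration of the file-parsing and data-munging topics listed above, the following is a minimal sketch using only the Python standard library rather than the packages named in the module. The sample data, the column names and the `clean_rows` helper are hypothetical and are not taken from the module materials; dropping rows with a missing value is just one possible cleaning policy.

```python
import csv
import io
import json

# Hypothetical raw data: stray whitespace, a missing value and a duplicate row.
RAW_CSV = """name,age,city
 Alice ,34,Durham
Bob,,Leeds
 Alice ,34,Durham
Carol,29, york
"""


def clean_rows(text):
    """Parse CSV text and return a list of cleaned, de-duplicated records."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Strip stray whitespace from every field.
        row = {key: value.strip() for key, value in row.items()}
        # Skip rows with a missing age (an assumed policy; imputation is another option).
        if not row["age"]:
            continue
        record = (row["name"], int(row["age"]), row["city"].title())
        # Drop exact duplicates while preserving the original order.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append({"name": record[0], "age": record[1], "city": record[2]})
    return cleaned


records = clean_rows(RAW_CSV)
# Write the cleaned records out in a different format (JSON).
as_json = json.dumps(records, indent=2)
print(as_json)
```

For larger datasets, Pandas provides comparable operations, such as `DataFrame.dropna` and `DataFrame.drop_duplicates`.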
Learning Outcomes
By the end of this module, students should:
- Understand advanced concepts of programming in Python.
- Have a critical appreciation of the main strengths and weaknesses of a range of Python packages and understand how to use them.
- Have a critical appreciation of how to acquire and clean datasets for analysis.
- Understand how to manipulate potentially large datasets in an efficient manner.
By the end of this module, students should:
- Be able to write computer programs in Python using industry-standard packages.
- Be able to select appropriate data structures for modelling various data science scenarios.
- Be able to select the appropriate algorithm and programming package for a given problem.
- Be able to write a computer program in Python to collect or read data from available sources, and clean these datasets using the appropriate packages.
- Effective written communication
- Planning, organising and time-management
- Problem solving and analysis
Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module
- This module will be delivered by the Department of Computer Science.
- Learning outcomes are met through classroom-based workshops, supported by online resources. The workshops consist of a combination of taught input, case studies, discussion and computing labs. Online resources will typically consist of directed reading and a programming environment with example code.
- The summative assessment is an individual written report on the design, implementation and analysis of a program designed to solve a specific data science problem.
Teaching Methods and Learning Hours
Activity | Number | Frequency | Duration | Total/Hours
Lectures | 8 | 2 times per week (Term 2, weeks 1-4) | 1 hour | 8
Workshops | 8 | 2 times per week (Term 2, weeks 1-4) | 2 hours | 16
Surgery | 12 | 3 times per week (Term 2, weeks 1-4) | 1 hour | 12
Summative Assessment
Component: Assignment | Component Weighting: 100%
Element | Length / duration | Element Weighting | Resit Opportunity
Individual written assignment based on development of a program | 2000 words | 100% |
The formative assessment consists of classroom-based exercises on specific computer science topics relevant to the learning outcomes of the module. Oral feedback will be given on a group and/or individual basis as appropriate.
■ Attendance at all activities marked with this symbol will be monitored. Students who fail to attend these activities, or to complete the summative or formative assessment specified above, will be subject to the procedures defined in the University's General Regulation V, and may be required to leave the University.