Undergraduate Programme and Module Handbook 2022-2023 (archived)
Module MATH2687: Data Science and Statistical Computing II
Department: Mathematical Sciences
Type | Level | Credits | Availability | Module Cap | Location |
---|---|---|---|---|---|
Open | 2 | 10 | Available in 2022/23 | None | Durham |
Prerequisites
- [Calculus I (Maths Hons) (MATHNEW1) or Calculus 1 (MATH1061) AND Linear Algebra I (Maths Hons) (MATHNEW2) or Linear Algebra 1 (MATH1071) AND Probability I (MATH1597) AND Statistics I (MATH1617)] OR [SMA (MATH1561) AND SMB (MATH1571)]
Corequisites
- None
Excluded Combination of Modules
- None
Aims
- To equip students with the skills to import, explore, manipulate, model and visualise real data sets using the statistical programming language R.
- To introduce students to the concepts and mathematics behind sampling and sampling-based estimators.
- To introduce students to the importance of data protection and governance issues in working with data.
Content
- Modern usage of R.
- Sampling (finite, stratification, clustered, ...).
- Visualization, plotting, exploratory data analysis, data cleaning, reporting tools.
- Data protection / governance.
- Approximating expectations of random variables by Monte Carlo. Accuracy of approximation. Sources of randomness.
- Generating random variables (inverse transform, rejection methods, importance sampling, discrete).
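As an illustration of the last two items, the following is a minimal sketch, not taken from the module materials, of approximating an expectation by Monte Carlo using draws generated by the inverse transform method. The module itself uses R; Python is used here for illustration only. The function names, the Exponential(2) target and the seed are all assumptions chosen for the example.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one Exponential(rate) variate by the inverse transform method."""
    u = rng.random()                   # U ~ Uniform[0, 1)
    return -math.log(1.0 - u) / rate   # F^{-1}(u) for F(x) = 1 - exp(-rate * x)


def monte_carlo_mean(draw, n):
    """Return the sample mean of n draws and its estimated standard error."""
    xs = [draw() for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased sample variance
    return mean, math.sqrt(var / n)


# Hypothetical setup: estimate E[X] for X ~ Exponential(2), whose true mean is 0.5.
rng = random.Random(2687)  # arbitrary fixed seed so the run is reproducible
estimate, std_error = monte_carlo_mean(lambda: sample_exponential(2.0, rng), 100_000)
print(f"estimate = {estimate:.4f}, standard error = {std_error:.4f}")
```

The standard error shrinks in proportion to 1/sqrt(n), so halving it takes roughly four times as many draws.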
Learning Outcomes
Subject-specific Knowledge:
- By the end of the module students will:
- have a solid foundation in the R programming language;
- be able to import and manipulate real world data sets using modern libraries in the R ecosystem;
- be able to perform an exploratory data analysis including a variety of visualisations;
- understand the mathematics of sampling-based estimators and simple Monte Carlo methods.
Subject-specific Skills:
- Students will have foundational skills in data science, specifically in data import, manipulation and exploration.
- Students will have foundational skills in sampling-based methodology.
Key Skills:
- Students will have basic mathematical skills in the following areas: problem solving, modelling, computation.
Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module
- Lectures demonstrate what is required to be learned and the application of the theory to practical examples.
- Computer practicals consolidate the studied material, explore theoretical ideas in practice, enhance practical understanding, and develop practical data analysis skills.
- Tutorials provide active problem-solving engagement and immediate feedback to the learning process.
- Assignments for self-study develop problem-solving skills and enable students to test and develop their knowledge and understanding.
- Formative assessments provide feedback to guide students in the correct development of their knowledge and skills in preparation for the summative assessment.
- Computer-based examinations assess the ability to use statistical software and basic programming to solve predictable and unpredictable problems.
- The end-of-year examination assesses the knowledge acquired and the ability to solve predictable and unpredictable problems.
Teaching Methods and Learning Hours
Activity | Number | Frequency | Duration | Total/Hours | |
---|---|---|---|---|---|
Lectures | 21 | Two per week in weeks 1-10 and one in week 21 | 1 hour | 21 | |
Tutorials | 6 | Weeks 2, 4, 6, 8, 10, 21 | 1 hour | 6 | ■ |
Computer Practicals | 10 | One per week in weeks 1-10 | 1 hour | 10 | ■ |
Preparation and reading | | | | 63 | |
Total | | | | 100 | |
Summative Assessment
Component: Examination | Component Weighting: 70% | ||
---|---|---|---|
Element | Length / duration | Element Weighting | Resit Opportunity |
Written Examination | 2 hours | 100% | |
Component: Practical Assessment | Component Weighting: 30% | ||
Element | Length / duration | Element Weighting | Resit Opportunity |
Computer-based examination | 2 hours | 100% |
Formative Assessment:
Weekly written or electronic assignments to be assessed and returned. Other assignments are set for self-study and complete solutions are made available to students.
■ Attendance at all activities marked with this symbol will be monitored. Students who fail to attend these activities, or to complete the summative or formative assessment specified above, will be subject to the procedures defined in the University's General Regulation V, and may be required to leave the University.