Module Details |
The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module. |
Title | INTRODUCTION TO DATA SCIENCE | ||
Code | COMP229 | ||
Coordinator |
Dr V Kurlin Computer Science Vitaliy.Kurlin@liverpool.ac.uk |
||
Year | CATS Level | Semester | CATS Value |
Session 2019-20 | Level 4 FHEQ | First Semester | 15 |
Aims |
|
1. To provide a foundation and overview of modern problems in Data Science.
2. To describe the tools and approaches for the design and analysis of algorithms for data clustering, dimensionality reduction, and graph reconstruction from noisy data.
3. To discuss the effectiveness and complexity of modern Data Science algorithms.
4. To review applications of Data Science to Vision, Networks, and Materials Chemistry. |
Learning Outcomes |
|
(LO1) describe modern problems and tools in data clustering and dimensionality reduction, |
|
(LO2) formulate a real data problem in a rigorous form and suggest potential solutions, |
|
(LO3) choose the most suitable approach or algorithmic method for given real-life data, |
|
(LO4) visualise high-dimensional data and extract hidden non-linear patterns from the data. |
|
(S1) Critical thinking and problem solving - Critical analysis |
Syllabus |
|
1. Metric Geometry (6 lectures): point clouds, distance functions, metric spaces, isometries and invariants, equivalence of point clouds up to linear transformations.
2. Clustering methods (6 lectures): graphs and trees, minimum spanning trees, the union-find algorithm, clustering based on connectivity, centroids, densities and distributions.
3. Computational Geometry (6 lectures): Voronoi decompositions, alpha-complexes, the Reeb graph, the Mapper algorithm, the graph reconstruction problem from noisy data.
4. Dimensionality reduction (6 lectures): linear operators, eigenvalues and eigenvectors, Principal Component Analysis (PCA) and Singular-Value Decomposition (SVD).
5. Geometric Data Analysis (6 lectures): graph Laplacians in spectral graph theory, graph partitioning algorithms, connectivity of networks, shape descriptors. |
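As an illustration of how the clustering topics above fit together, the following is a minimal sketch (not taken from the module materials) of connectivity-based clustering: edges of the complete distance graph are processed in increasing order, as in Kruskal's minimum spanning tree algorithm, and a union-find structure merges components until k clusters remain. The names `UnionFind` and `single_linkage_clusters` and the sample points are hypothetical, chosen only for this example; `math.dist` assumes Python 3.8 or later.

```python
import math
from itertools import combinations


class UnionFind:
    """Disjoint-set structure with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Walk up to the root, shortening the path along the way.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Merge the sets containing a and b; return False if already merged.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def single_linkage_clusters(points, k):
    """Return one cluster label per point, stopping at k components.

    Hypothetical helper: adds the shortest edges first (the edges of a
    minimum spanning tree) and stops before the k-1 longest tree edges.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    uf = UnionFind(n)
    components = n
    for _, i, j in edges:
        if components <= k:
            break
        if uf.union(i, j):
            components -= 1
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(uf.find(i), len(labels)) for i in range(n)]


# Illustrative point cloud: two well-separated groups in the plane.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(single_linkage_clusters(sample, 2))  # [0, 0, 0, 1, 1]
```

Sorting all pairwise distances costs O(n^2 log n) time, which is acceptable for small point clouds; this is a design choice for brevity rather than a recommendation for large data.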
Teaching and Learning Strategies |
|
Teaching Method 1 - Lecture. Description: Formal Lectures.
Teaching Method 2 - Tutorial. Description: Tutorials with 4-5 formative assessments (marked by demonstrators), using problems similar to exam questions. |
Teaching Schedule |
 | Lectures | Seminars | Tutorials | Lab Practicals | Fieldwork Placement | Other | TOTAL |
Study Hours | 30 | | 10 | | | | 40 |
Timetable (if known) | |||||||
Private Study | 110 | ||||||
TOTAL HOURS | 150 |
Assessment |
EXAM | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
CONTINUOUS | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
Recommended Texts |
|
Reading lists are managed at readinglists.liverpool.ac.uk. |