ULMS Electronic Module Catalogue |
The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module. |
Title | Data Mining and Machine Learning | ||
Code | EBUS537 | ||
Coordinator |
Professor D Song Operations and Supply Chain Management Dongping.Song@liverpool.ac.uk |
||
Year | CATS Level | Semester | CATS Value |
Session 2022-23 | Level 7 FHEQ | First Semester | 15 |
Pre-requisites before taking this module (other modules and/or general educational/academic requirements): |
Modules for which this module is a pre-requisite: |
Programme(s) (including Year of Study) to which this module is available on a required basis: |
Programme(s) (including Year of Study) to which this module is available on an optional basis: |
Teaching Schedule |
Lectures | Seminars | Tutorials | Lab Practicals | Fieldwork Placement | Other | TOTAL | |
Study Hours |
24 |
6 |
6 |
36 | |||
Timetable (if known) |
120 mins X 1 totalling 24
|
60 mins X 1 totalling 6
|
60 mins X 1 totalling 6
|
||||
Private Study | 114 | ||||||
TOTAL HOURS | 150 |
Assessment |
||||||
EXAM | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
CONTINUOUS | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
Individual assignment 2. There is a resit opportunity. Standard UoL penalty applies for late submission. This is an anonymous assessment. | 0 | 50 | ||||
Individual assignment 1. There is a resit opportunity. Standard UoL penalty applies for late submission. This is an anonymous assessment. | 0 | 50 |
Aims |
|
To demonstrate in-depth understanding and knowledge of the concepts, theories and developments associated with the subject area, and critically and analytically discuss outcomes in a methodological, structured, logical and in-depth manner; to demonstrate the ability to apply current tools and techniques of machine learning and data mining in suitable depth and at the appropriate level. |
Learning Outcomes |
|
(LO1) Gain in-depth knowledge of the principles of data mining and machine learning; |
|
(LO2) Critically assess the strengths and weaknesses of various data mining and machine learning techniques from a practitioner/user perspective; |
|
(LO3) Be able to identify, formulate and solve problems arising from practical applications using data mining and machine learning principles and techniques. |
|
(S1) Adaptability |
|
(S2) Problem solving skills |
|
(S3) Commercial awareness |
|
(S4) Organisation skills |
|
(S5) Communication skills |
|
(S6) IT skills |
|
(S7) International awareness |
|
(S8) Lifelong learning skills |
Teaching and Learning Strategies |
|
2-hour lecture x 12 weeks |
Syllabus |
|
Concepts of data mining, machine learning, supervised learning, unsupervised learning, and the Cross-Industry Standard Process for Data Mining (CRISP-DM) model.
Data understanding and data preparation: data attributes and data types (nominal, ordinal, interval and ratio data); data quality issues (e.g. missing values, noise, outliers); data pre-processing including aggregation, sampling, feature selection, feature creation and discretisation; basic R programming language.
Data exploration: using R programming tools to explore dataset structure and summary statistics (e.g. median, mean, range and frequency); using R programming tools to visualise data (e.g. boxplot, scatter plot, 2D and 3D histograms) and identify outliers.
Classification modelling methods: induction process to build a decision tree; deduction process to apply a decision tree; Hunt's algorithm; impurity measures (Gini index and entropy); random forest classifier; rule-based classifier; practical issues of classification (underfitting, overfitting, evaluation); using R programming tools to facilitate building decision trees and calculating impurity measures.
Clustering: types of clusters; types of clustering techniques; applications of clustering; K-means algorithm; methods for evaluating clustering performance; limitations of K-means; hierarchical clustering algorithm; agglomerative clustering algorithm; approaches to defining inter-cluster similarity; using R programming tools for clustering.
Mining association rules: introduction to association rule mining; types of association rules; a two-step approach to mining association rules; use of the Apriori algorithm; computational steps for rule generation; hash tree technique; recent applications of the association rule mining technique; using R programming tools for association rule learning.
Fuzzy logic technique: the fuzziness of data; fuzzification and membership function determination; composition of fuzzy sets; fuzzy inference engine; defuzzification and rule firing; recent applications of fuzzy logic technique; other integrated-fuzzy approaches. |
Recommended Texts |
|
Reading lists are managed at readinglists.liverpool.ac.uk. |