ULMS Electronic Module Catalogue

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
Title: Data Mining and Machine Learning
Code: EBUS537
Coordinator: Professor D Song, Operations and Supply Chain Management, Dongping.Song@liverpool.ac.uk
Year: Session 2022-23
CATS Level: Level 7 FHEQ
Semester: First Semester
CATS Value: 15

Pre-requisites before taking this module (other modules and/or general educational/academic requirements):

 

Modules for which this module is a pre-requisite:

 

Programme(s) (including Year of Study) to which this module is available on a required basis:

 

Programme(s) (including Year of Study) to which this module is available on an optional basis:

 

Teaching Schedule

Study Hours: Lectures 24; Seminars 6; Other 6; TOTAL 36 (no hours listed for Tutorials, Lab Practicals or Fieldwork Placement)
Timetable (if known): Lectures 120 mins x 1, totalling 24; Seminars 60 mins x 1, totalling 6; Other 60 mins x 1, totalling 6
Private Study: 114
TOTAL HOURS: 150

Assessment

EXAM (Duration; Timing (Semester); % of final mark; Resit/resubmission opportunity; Penalty for late submission; Notes)
No entries listed.

CONTINUOUS (Duration; Timing (Semester); % of final mark; Resit/resubmission opportunity; Penalty for late submission; Notes)
Individual assignment 1: 50% of final mark. There is a resit opportunity. Standard UoL penalty applies for late submission. This is an anonymous assessment.
Individual assignment 2: 50% of final mark. There is a resit opportunity. Standard UoL penalty applies for late submission. This is an anonymous assessment.

Aims

To demonstrate in-depth understanding and knowledge of the concepts, theories and developments associated with the subject area, and critically and analytically discuss outcomes in a methodological, structured, logical and in-depth manner;

To demonstrate the ability to apply current tools and techniques of machine learning and data mining in suitable depth and at the appropriate level.


Learning Outcomes

(LO1) Gain in-depth knowledge of the principles of data mining and machine learning;

(LO2) Critically assess the strengths and weaknesses of various data mining and machine learning techniques from a practitioner/user perspective;

(LO3) Be able to identify, formulate and solve problems arising from practical applications using data mining and machine learning principles and techniques.

(S1) Adaptability
Students will develop adaptability by engaging with case studies and assignments to understand the application of data mining and machine learning techniques.

(S2) Problem solving skills
Students will develop problem-solving skills through practising exercises and undertaking assignments.

(S3) Commercial awareness
Students will develop knowledge of commercial contexts of machine learning techniques and their practical issues.

(S4) Organisation skills
Students will develop time management skills by meeting deadlines for class discussion tasks and assignments.

(S5) Communication skills
Students will develop communication skills by engaging with case studies, report writing and working in groups.

(S6) IT skills
Students will develop IT skills and perform programming exercises during lab sessions.

(S7) International awareness
Students will develop international awareness through case studies of data mining in an international context.

(S8) Lifelong learning skills
Students will develop skills of lifelong learning through self-directed study of cases and reading materials, finding relevant information, and preparation for their assessments.


Teaching and Learning Strategies

2 hour lecture x 12 weeks
1 hour seminar x 6 weeks
1 hour group learning x 6 weeks
114 hours self-directed learning


Syllabus

 

Concepts of data mining, machine learning, supervised learning, unsupervised learning, and the Cross-Industry Standard Process for Data Mining (CRISP-DM) model.

Data understanding and data preparation; data attributes and data types (nominal, ordinal, interval and ratio data); data quality issues (e.g. missing values, noise, outliers); data pre-processing including aggregation, sampling, feature selection, feature creation and discretisation; basic R programming language.

Data exploration; using R programming tools to explore dataset structure and summary statistics (e.g. median, mean, range and frequency); using R programming tools to visualise data (e.g. boxplot, scatter plot, 2D and 3D histograms) and identify outliers.
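As an illustration of the summary statistics and outlier identification listed above, the following is a minimal sketch only. It is written in Python for brevity (the module itself uses R), the function names and the sample data are hypothetical, and it assumes the common boxplot convention of flagging values more than 1.5 times the interquartile range beyond the quartiles.

```python
import statistics

def summarise(values):
    """Return basic summary statistics for a non-empty list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }

def boxplot_outliers(values, k=1.5):
    """Flag values beyond k * IQR from the quartiles (boxplot convention).

    Assumes at least two values; quartiles use the statistics module's
    default method, which may differ slightly from other tools.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: nine ordinary readings and one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = summarise(sample)
outliers = boxplot_outliers(sample)
```

The median here is far less affected by the extreme value than the mean, which is one reason both are usually inspected together during data exploration.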

Classification modelling methods; induction process to build decision tree; deduction process to apply decision tree; Hunt's algorithm; impurity measures (Gini index and entropy); random forest classifier; rule-based classifier; practical issues of classification (underfitting, overfitting, evaluation); using R programming tools to facilitate building decision trees and calculating impurity measures.
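The two impurity measures named above can be sketched as follows. This is an illustrative example only, in Python rather than the R used in the module; the function names and the toy labels are hypothetical, and the code assumes non-empty label lists.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Entropy in bits: sum of -p * log2(p) over the class proportions."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def weighted_impurity(children, measure):
    """Impurity of a split: child impurities weighted by child size."""
    total = sum(len(child) for child in children)
    return sum(len(child) / total * measure(child) for child in children)

# Hypothetical node with two classes in equal proportion.
parent = ["yes", "yes", "no", "no"]
# A candidate split that separates the classes perfectly.
split = [["yes", "yes"], ["no", "no"]]

parent_gini = gini(parent)
parent_entropy = entropy(parent)
split_gini = weighted_impurity(split, gini)
```

A decision tree induction procedure would typically prefer the split with the largest drop from the parent impurity to the weighted child impurity.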

Clustering: types of clusters; types of clustering techniques; applications of clustering; K-means algorithm; methods for evaluating the clustering performance; limitations of K-means; hierarchical clustering algorithm; agglomerative clustering algorithm; approaches to define inter-cluster similarity; using R programming tools for clustering.
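For the K-means algorithm and its evaluation mentioned above, a minimal sketch is given below. It is in Python rather than R, uses invented two-dimensional points, and makes a simplifying assumption that the first k points serve as the initial centroids (practical implementations usually initialise randomly or with a dedicated scheme). The function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points]

def mean_point(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))

def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: naive initialisation
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a cluster happens to be empty.
            new_centroids.append(mean_point(members) if members else centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def sse(points, centroids, labels):
    """Sum of squared errors, a common measure of cluster cohesion."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Hypothetical data: two well-separated groups of three points.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centres, cluster_labels = kmeans(data, k=2)
error = sse(data, centres, cluster_labels)
```

Because the result depends on the starting centroids, poorly chosen initial points can lead to a worse final clustering, which is one of the limitations of K-means noted in the syllabus.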

Mining association rules: introduction to association rule mining; types of association rules; a two-step approach to mine association rules; the use of the Apriori algorithm; computational steps for rule generation; hash tree technique; recent applications of association rule mining; using R programming tools for association rule learning.
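The two-step approach above (find frequent itemsets, then generate rules from them) can be sketched as follows. This is an illustrative Python example rather than the R tooling used in the module; the function names, thresholds and shopping-basket data are hypothetical, and the hash tree optimisation is deliberately omitted.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(transactions, min_support):
    """Step 1: level-wise search for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def generate_rules(frequent, min_confidence):
    """Step 2: derive rules antecedent -> consequent from frequent itemsets."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, s, confidence))
    return rules

# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=0.6)
strong_rules = generate_rules(frequent_itemsets, min_confidence=0.9)
```

The pruning step relies on the fact that every subset of a frequent itemset is also frequent, which is what lets the search discard most candidates without counting them.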

Fuzzy logic technique: the fuzziness of data; fuzzification and membership function determination; composition of fuzzy sets; fuzzy inference engine; defuzzification and rule firing; recent applications of the fuzzy logic technique; other integrated-fuzzy approaches.
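The fuzzification, rule firing and defuzzification steps listed above can be illustrated with a small sketch. It is in Python, not the module's own tooling, and every detail is an assumption made for illustration: the temperature sets, the fan-speed outputs, the function names, and the use of a weighted average of crisp rule outputs (a zero-order Sugeno-style simplification) in place of centroid defuzzification over output fuzzy sets.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a, rising to 1 at b, falling to 0 at c.

    Assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets for temperature in degrees Celsius.
TEMPERATURE_SETS = {
    "cold": (-10, 0, 20),
    "warm": (10, 20, 30),
    "hot": (20, 35, 50),
}

# Hypothetical rules: IF temperature is <set> THEN fan speed is <percent>.
RULE_OUTPUTS = {"cold": 0.0, "warm": 50.0, "hot": 100.0}

def fuzzify(temperature):
    """Degree of membership of a crisp temperature in each fuzzy set."""
    return {name: triangular(temperature, *abc)
            for name, abc in TEMPERATURE_SETS.items()}

def infer_fan_speed(temperature):
    """Fire each rule by its membership degree, then take the weighted average."""
    strengths = fuzzify(temperature)
    total = sum(strengths.values())
    if total == 0:
        return None  # no rule fires for this input
    return sum(strengths[name] * RULE_OUTPUTS[name] for name in strengths) / total

speed_at_25 = infer_fan_speed(25)
speed_at_20 = infer_fan_speed(20)
```

At 25 degrees the input is partly "warm" and partly "hot", so both rules fire and the output falls between their individual fan speeds.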


Recommended Texts

Reading lists are managed at readinglists.liverpool.ac.uk.