Module Details

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
Title BIG DATA ANALYSIS
Code COMP529
Coordinator Dr M Amen, Computer Science, Bakhtiar.Amen@liverpool.ac.uk
Year Session 2019-20
CATS Level M Level
Semester First Semester
CATS Value 15

Aims

To introduce the student to middleware often used in Big Data analytics.
To introduce the student to implementing algorithms using such middleware.


Learning Outcomes

(LO1) Understanding of algorithmic approaches for handling batch and streaming analysis.

(LO2) Understanding of middleware that can be used to enable algorithms to scale up to analysis of large datasets.

(LO3) Understanding of the impact of the middleware on how algorithms are articulated.

(S1) Numeracy/computational skills - Reason with numbers/mathematical concepts

(S2) Communication (oral, written and visual) - Following instructions/protocols/procedures


Syllabus

 

Week 1: Introduction to Big Data, motivating real-world applications and assumed dependencies (including a discussion of operating systems).
Week 2: Setting up Middleware for batch analytics with a specific focus on installing Hadoop and running a Map-Reduce job.
Week 3: Introduction to Probabilistic Modelling of large datasets (e.g. Latent Dirichlet Allocation).
Week 4: Scalable algorithms for analysing large datasets (i.e. Bayesian Network algorithms).
Week 5: Porting such algorithms to Hadoop.
Week 6: Real-world applications of batch analytics.
Week 7: Setting up Middleware for Streaming Analytics with a specific focus on installing IBM’s InfoSphere Streams and adding a streaming operator.
Week 8: Introduction to Sequential Bayesian Inference.
Week 9: Algorithms for analysing streaming data (e.g. Kalman filter).
Week 10: Porting such algorithms to Streams.
Week 11: Real-world applications of streaming analytics.
Week 12: Beyond separate batch and streaming analytics.
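
The batch and streaming styles named in the syllabus can be sketched informally in plain Python. This is a minimal illustration under stated assumptions, not Hadoop or InfoSphere Streams code: the batch half expresses a word count as map and reduce steps, and the streaming half is a scalar Kalman filter that assumes a random-walk state observed directly with Gaussian noise. All names and parameters (`map_words`, `reduce_counts`, `word_count`, `kalman_1d`, `process_var`, `measurement_var`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs):
    """Shuffle and reduce steps: group pairs by word and sum their counts."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Batch job: map every line, then reduce all emitted pairs at once."""
    return reduce_counts(chain.from_iterable(map_words(line) for line in lines))


def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Streaming job: scalar Kalman filter for an assumed random-walk state.

    Yields the posterior mean after each measurement arrives, so the
    estimate is refined one item at a time rather than over a full batch.
    """
    for z in measurements:
        var += process_var                    # predict: uncertainty grows
        gain = var / (var + measurement_var)  # Kalman gain
        mean += gain * (z - mean)             # update: move towards measurement
        var *= 1.0 - gain                     # update: uncertainty shrinks
        yield mean
```

For example, `word_count(["big data", "Big analysis"])` returns `{"big": 2, "data": 1, "analysis": 1}`, while `kalman_1d` is a generator and can be consumed lazily as measurements arrive.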


Teaching and Learning Strategies

Teaching Method 1 - Lecture
Teaching Method 2 - Tutorial


Teaching Schedule

Study Hours Lectures 36, Tutorials 12, TOTAL 48
Private Study 102
TOTAL HOURS 150

Assessment

EXAM Unseen Written Exam
Duration 120 minutes
Timing (Semester) Semester 1
% of final mark 60
Resit/resubmission opportunity Yes
Penalty for late submission Standard UoL penalty applies
Notes Final Exam. Notes (applying to all assessments): Two assessment tasks (not marked anonymously; each is expected to take approximately 18 hours of work to complete and involves installing software, writing code and writing a report), as well as a written examination.

CONTINUOUS Assessment 2
Duration 36 hours for all CAs
Timing (Semester) Semester 1
% of final mark 20
Resit/resubmission opportunity Yes
Penalty for late submission Standard UoL penalty applies
Notes This is not an anonymous assessment.

CONTINUOUS Assessment 1
Duration 36 hours for all CAs
Timing (Semester) Semester 1
% of final mark 20
Resit/resubmission opportunity Yes
Penalty for late submission Standard UoL penalty applies
Notes This is not an anonymous assessment.

Recommended Texts

Reading lists are managed at readinglists.liverpool.ac.uk.