Module Details

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
Title: Big Data Analysis
Code: COMP529
Coordinator: Prof S Maskell, School of Electrical Engineering, Electronics and Computer Science, S.Maskell@liverpool.ac.uk
Year: Session 2017-18
CATS Level: Level 7 FHEQ
Semester: First Semester
CATS Value: 15

Aims

  • To introduce the student to middleware often used in Big Data analytics.

  • To introduce the student to implementing algorithms using such middleware.


    Learning Outcomes

    Understanding of algorithmic approaches for handling batch and streaming analysis.

    Understanding of middleware that can be used to enable algorithms to scale up to analysis of large datasets.

    Understanding of the impact of the middleware on how algorithms are articulated.


    Syllabus

    Week 1: Introduction to Big Data, motivating real-world applications and assumed dependencies (including a discussion of the operating system).

    Week 2: Setting up middleware for batch analytics, with a specific focus on installing Hadoop and running a MapReduce job.

    Week 3: Introduction to Probabilistic Modelling of large datasets (e.g. Latent Dirichlet Allocation).

    Week 4: Scalable algorithms for analysing large datasets (i.e. Bayesian Network algorithms).

    Week 5: Porting such algorithms to Hadoop.

    Week 6: Real-world applications of batch analytics.

    Week 7: Setting up middleware for streaming analytics, with a specific focus on installing IBM's InfoSphere Streams and adding a streaming operator.

    Week 8: Introduction to Sequential Bayesian Inference.

    Week 9: Algorithms for analysing streaming data (e.g. the Kalman filter).

    Week 10: Porting such algorithms to Streams.

    Week 11: Real-world applications of streaming analytics.


    Week 12: Beyond separate batch and streaming analytics.



    Teaching and Learning Strategies

    Lecture

    Tutorial


    Teaching Schedule

                          Lectures  Seminars  Tutorials  Lab Practicals  Fieldwork Placement  Other  TOTAL
    Study Hours           36        -         12         -               -                    -      48
    Timetable (if known)  -         -         -          -               -                    -      -
    Private Study 102
    TOTAL HOURS 150

    Assessment

    EXAM: Unseen Written Exam
      Duration: 120
      Timing (Semester): Semester 1
      % of final mark: 60
      Resit/resubmission opportunity: Yes
      Penalty for late submission: Standard UoL penalty applies
      Notes: Final Exam

    CONTINUOUS: Coursework
      Duration: 36 hours for all CAs
      Timing (Semester):
      % of final mark: 20
      Resit/resubmission opportunity: Yes
      Penalty for late submission: Standard UoL penalty applies
      Notes: Assessment 1

    CONTINUOUS: Coursework
      Duration: 36 hours for all CAs
      Timing (Semester): Semester 1
      % of final mark: 20
      Resit/resubmission opportunity: Yes
      Penalty for late submission: Standard UoL penalty applies
      Notes: Assessment 2

    Notes (applying to all assessments): Two assessment tasks (not marked anonymously), each of which is expected to take approximately 18 hours of work to complete; each involves installing software, writing code and writing a report. Written examination.

    Recommended Texts

    Reading lists are managed at readinglists.liverpool.ac.uk.