ULMS Electronic Module Catalogue

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
Title: Machine Learning and Big Data Econometrics
Code: ECON701
Coordinator: Dr GD Liu-Evans, Economics, Gareth.Liu-Evans@liverpool.ac.uk
Year: Session 2022-23
CATS Level: Level 7 FHEQ
Semester: Second Semester
CATS Value: 15

Pre-requisites before taking this module (other modules and/or general educational/academic requirements):

ECON814 ECONOMETRIC AND STATISTICAL METHODS 

Modules for which this module is a pre-requisite:

 

Programme(s) (including Year of Study) to which this module is available on a required basis:

 

Programme(s) (including Year of Study) to which this module is available on an optional basis:

 

Teaching Schedule

                      Study Hours   Timetable (if known)
Lectures              24            120 mins X 1 totaling 24
Seminars              6             60 mins X 1 totaling 6
Tutorials
Lab Practicals
Fieldwork Placement
Other                 6             60 mins X 1 totaling 6
TOTAL                 36

Private Study         114
TOTAL HOURS           150

Assessment

EXAM: Examination
Duration:
Timing (Semester):
% of final mark: 50
Resit/resubmission opportunity: Yes
Penalty for late submission: Standard UoL penalty applies
Notes: Anonymous Assessment: Yes

CONTINUOUS: Individual Data Analysis Report
Duration:
Timing (Semester):
% of final mark: 50
Resit/resubmission opportunity: Yes
Penalty for late submission: Standard UoL penalty applies
Notes: Anonymous Assessment: Yes

Aims

The module aims to prepare students for careers where a good understanding of Machine Learning methods and Python programming is necessary or advantageous. Examples include: research careers in applied economics or finance, careers in data science, or careers in data analysis generally.


Learning Outcomes

(LO1) Students will be able to define, explain and motivate a number of Machine Learning methods.

(LO2) Students will be able to use libraries in Python for Machine Learning and scientific research.

(LO3) Students will be able to produce Jupyter Notebook documents, mixing formatted text in Markdown with Python code.

(LO4) Students will gain good general proficiency in the Python programming language.

(S1) Flexibility and adaptability
Students will need to draw on a variety of resources to learn both the underlying Machine Learning methodology and the programming, which requires a degree of adaptability.

(S2) Problem solving
The module involves programming exercises.

(S3) Numeracy
The module involves a substantial amount of data analysis.

(S4) Commercial awareness
Commercial uses of the Machine Learning methodologies will be described in class. The Python libraries used in the module are widely used in the commercial world.


Teaching and Learning Strategies

2-hour lecture x 12 weeks
1-hour seminar x 6 weeks
1-hour group learning x 6 weeks
114 hours self-directed learning

Students will need to spend time studying the machine learning methodology, together with the code examples and online documentation. They will also need to spend significant time coding in Python, both to gain confidence with the tools and to complete the coursework assessment.


Syllabus

 

Main topics:

Least Squares review, and Subset Selection methods;

Principal Components regression;

Lasso and related methods;

Regression trees and random forests;

General principles for building predictive Machine Learning models;

Econometric inference in high dimensional models;

Deep Learning – several sessions covering algorithmic details (backpropagation and gradient descent, stochastic gradient descent, regularisation), and the use of Python libraries for deep learning;

Applications of Big Data in Economics.

Python libraries (indicative examples): numpy, pandas, scikit-learn, pytorch/keras, nltk. Programming environments: Jupyter Notebook, Spyder. Students are encouraged to install the Anaconda distribution of Python 3, which is freely available, as this comes with Jupyter Notebook, Spyder, and many of the required libraries.

Learning resources will be available on Canvas, e.g. Jupyter Notebooks containing code from lectures and lab sessions. Participants are encouraged to read the recommended textbooks for further detail about the Machine Learning methodologies and their application in Python, and references will be given in class. There are many good textbook treatments of the topics covered in the module.

The websites for the Python packages and libraries generally have introductory examples, tutorials and documentation, and participants on the module are encouraged to work through these. The R language will be introduced for one topic, which uses the hdm package.


Recommended Texts

Reading lists are managed at readinglists.liverpool.ac.uk.