SDSC6009 - Machine Learning at Scale

Offering Academic Unit
School of Data Science
Credit Units
3
Course Duration
One Semester
Pre-requisite(s)
Course Offering Term*:
Not offered in the current academic year

* The offering term is subject to change without prior notice
 
Course Aims

This course teaches the underlying principles required to develop scalable machine learning pipelines for structured and unstructured data at petabyte scale. It covers how to scale machine learning to big data using the MapReduce parallel computing model. It also discusses the hands-on design and development of machine learning algorithms in parallel computing environments (Spark). Students will use MapReduce parallel computing frameworks for machine learning in industrial applications and deployments across various fields, including advertising, finance, healthcare, and search engines.

Assessment (indicative only; please check the detailed course information)

Continuous Assessment: 65%
Examination: 35%
Examination Duration: 2 hours
 
Detailed Course Information

SDSC6009.pdf

Useful Links

School of Data Science