SDSC6009 - Machine Learning at Scale

Offering Academic Unit
Department of Data Science
Credit Units
3
Course Duration
One Semester
Course Offering Term*:
Semester A 2025/26

* The offering term is subject to change without prior notice
 
Course Aims

This course teaches the underlying principles required to develop scalable machine learning pipelines for structured and unstructured data at the petabyte scale. It covers the principles of scaling machine learning processes to big data by deploying the MapReduce parallel computing model. It also covers the hands-on design and development of machine learning algorithms in parallel computing environments (Spark). Students will use MapReduce parallel computing frameworks for machine learning in industrial applications and deployments across various fields, including advertising, finance, healthcare, and search engines.


Assessment (indicative only; please check the detailed course information)

Continuous Assessment: 65%
Examination: 35%
Examination Duration: 2 hours
Min. Examination Passing Requirement: 30%
 

To pass the course, a student must obtain at least 30% of the maximum mark for the examination.

 
Detailed Course Information

SDSC6009.pdf