Data science is a fast-changing discipline, driven by the rapid increase in the amount of data available for analysis.
Every year, research labs compete to produce ever more scalable and efficient data processing tools and techniques. This creates a gap between what is taught in undergraduate programs and what is currently used in industry and research labs. This project aims to bridge this gap by enabling students to work on real-world data problems using modern large-scale data processing tools. Participants are expected to gain proficiency in the topic they choose to investigate in depth.