Million Song Dataset Analytics
PySpark, MLlib, TensorFlow, AWS
Instructors: Virginia Smith and Ameet Talwalkar
Project summary
- Carried out feature engineering and pre-processing in parallel with PySpark on AWS EC2
- Modeled the relationship between artist familiarity and the engineered features using various MLlib tools and TensorFlow
- Predicted artist familiarity with the resulting pipeline, then visualized and analyzed the results for business insights