Hadoop LastFM Analysis

This module demonstrates how Hadoop and WMR can be used to analyze the LastFM Million Song Dataset. It incorporates several advanced Hadoop techniques, such as job chaining and multiple inputs. Students should know how to use the WMR Hadoop interface before beginning this module.
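To make the idea of job chaining concrete, here is a minimal local sketch in plain Python. It does not use the WMR or Hadoop APIs: the helper `run_job`, the mapper and reducer names, and the tiny `tracks` list are all hypothetical and exist only for illustration. It shows the output of one map-reduce pass being fed in as the input of a second pass.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Simulate one map-reduce pass locally (illustrative, not a WMR API)."""
    groups = defaultdict(list)
    # Map phase: each input record may emit any number of (key, value) pairs.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: keys are processed in sorted order, one call per key.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: count how many tracks each artist has.
def count_mapper(track_id, artist):
    yield artist, 1


def count_reducer(artist, ones):
    yield artist, sum(ones)


# Job 2: group artists by their track count, using job 1's output as input.
def invert_mapper(artist, count):
    yield count, artist


def invert_reducer(count, artists):
    yield count, sorted(artists)


# Made-up sample records of the form (track_id, artist).
tracks = [("t1", "A"), ("t2", "B"), ("t3", "A"), ("t4", "C")]

counts = run_job(tracks, count_mapper, count_reducer)
by_count = run_job(counts, invert_mapper, invert_reducer)

print(counts)    # [('A', 2), ('B', 1), ('C', 1)]
print(by_count)  # [(1, ['B', 'C']), (2, ['A'])]
```

On a real cluster, each pass would be submitted as its own job, with the second job reading the saved output of the first rather than an in-memory list.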

The dataset was obtained from Columbia University’s LabROSA. However, it has been converted into a format that is easier to work with on WMR. The edited dataset is also much smaller, since it omits the audio analysis information. If you would like the smaller dataset for your own WMR cluster, please contact JLyman@macalester.edu.