Hadoop Network Analysis

This module was created for CSInParallel by Jeffrey Lyman in 2014 (JLyman@macalester.edu)

The purpose of this module is to teach students how to analyze networks and datasets distributed over multiple files using the Hadoop framework. It is assumed that students are already familiar with the basics of hadoop and CSInParallel’s Web Map Reduce (WMR) hadoop interface.

The exercises in this module use a network of friendships on the social movie recommendation site Flixster. Students will use it to learn how to analyze networks and chain jobs.

The dataset can be downloaded from Academic Torrents