MapReduce: Simplified Data Processing on Large Clusters Jeffrey Dean and Sanjay Ghemawat Abstract MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with t