Answer (1 of 8): I thought it might be worthwhile to put all the proposed solutions in context, to see when one might use one over another. The atbrox blog post (http://atbrox.com/2010/02/08/parallel-machine-learning-for-hadoopmapreduce-a-python-example/) that Soren points to gives details on map-redu...