Introduction It’s been a while since Google introduced their MapReduce framework for distributed computing on large data sets on clusters of computers. Paradigma Labs was thinking of trying Apache Hadoop to run this kind of tasks, so it was the proper choice to run some web scraping we have in our hands. I was the designated developer for the task, and my love for Ruby on Rails led me to give Hado