TF-IDF in Hadoop Part 1: Word Frequency in Doc

My interest in parallel computing dates back to my undergraduate years, just one or two years after Google published its paper on efficient data processing. Ever since, I have wondered how they manage to index “the web”. As I started learning the API and the HDFS, as well as exploring the implementation