Anand Rajaraman and Jeffrey D. Ullman have put together a new ebook, Mining of Massive Datasets. The book builds on the course materials for the Stanford CS345 course "Web Mining" and the CS246 class, "Mining Massive Data Sets".

- An introduction to data mining
- Large-scale processing with distributed file systems and MapReduce
- Similarity search: nearest neighbor, minhashing, LSH, etc.
- Algorithms for mi