Bookmarks / www.chrisstucchio.com (1)

  • Don't use Hadoop - your data isn't that big

    "So, how much experience do you have with Big Data and Hadoop?" they asked me. I told them that I use Hadoop all the time, but rarely for jobs larger than a few TB. I'm basically a big data neophite - I know the concepts, I've written code, but never at scale. The next question they asked me. "Could you use Hadoop to do a simple group by and sum?" Of course I could, and I just told them I needed t

    yass 2013/09/18
    " If you have a single table containing many terabytes of data, Hadoop might be a good option for running full table scans on it. If you don’t have such a table, avoid Hadoop like the plague. / Hadoop does not have any conception of indexing. Hadoop has only full table scans. "
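For a sense of how small the "simple group by and sum" from the excerpt can be when the data fits on one machine, here is a minimal sketch in plain Python. It is not the article's own code: the function name `group_by_sum`, the column names `customer` and `amount`, and the sample rows are all hypothetical, chosen only for illustration.

```python
import csv
import io
from collections import defaultdict


def group_by_sum(rows, key_field, value_field):
    """Make one pass over the rows, summing value_field for each key_field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_field]] += float(row[value_field])
    return dict(totals)


# Hypothetical sample data; in practice this would be a file on local disk.
SAMPLE_CSV = """customer,amount
alice,10.5
bob,3.0
alice,4.5
"""

# csv.DictReader streams rows one at a time, so memory use depends on the
# number of distinct keys rather than on the number of rows.
totals = group_by_sum(csv.DictReader(io.StringIO(SAMPLE_CSV)), "customer", "amount")
print(totals)
```

Like a Hadoop job, this is a full scan of the input with no index. The difference is that it runs in a single process with no cluster to operate, which is the trade-off the bookmarked article argues for when the data is not many terabytes.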