2. Outline ‣ Problem Statement ‣ CSV? XML? JSON? Regex? ‣ Protocol Buffers ‣ Codegen, Hadoop and You ‣ Applications ‣ Conclusions and Next Steps 3. My Background ‣ Studied Mathematics and Physics at Harvard, Physics at Stanford ‣ Tropos Networks (city-wide wireless): mesh routing algorithms, GBs of data ‣ Cooliris (web media): Hadoop and Pig for analytics, TBs of data ‣ Twitter: Hadoop, Pig, HBase