This talk was given at Midwest.io 2014. Cloudera's Data Science Team has a simple mission: build an analytics infrastructure so awesome that it makes Google's Ads Quality Team seethe with jealousy. To that end, I'll give an overview of Cloudera's current data science tools, including Oryx and Spark for building and serving machine learning models, Gertrude for multivariate testing, and Impala for