Parquet is a columnar storage format for Hadoop data. It was developed by Twitter and Cloudera to optimize storage and querying of large datasets. Parquet provides more efficient compression and I/O compared to traditional row-based formats by storing data by column. Early results show a 28% reduction in storage size and up to a 114% improvement in query performance versus the original Thrift form
![Parquet Hadoop Summit 2013](https://cdn-ak-scissors.b.st-hatena.com/image/square/4710befbdbaf2a6938dfd2dbc3288c2dc0176dcc/height=288;version=1;width=512/https%3A%2F%2Fcdn.slidesharecdn.com%2Fss_thumbnails%2Fparquethadoopsummit2013-130627111442-phpapp01-thumbnail.jpg%3Fwidth%3D640%26height%3D640%26fit%3Dbounds)