Introduction

Cloudera Impala supports low-latency, interactive queries on Hadoop data sets stored either in the Hadoop Distributed File System (HDFS) or in HBase, the distributed NoSQL database for Hadoop. Impala’s approach is to use Hadoop as a storage engine while moving away from MapReduce. Instead, Impala runs distributed queries, a concept inherited from massively parallel processing (MPP) databases. A
![Integrating R with Cloudera Impala for Real-Time Queries on Hadoop](https://cdn-ak-scissors.b.st-hatena.com/image/square/7b58308fadf3d4a51e12407cecded7a2fdfe5392/height=288;version=1;width=512/https%3A%2F%2Fbighadoop.wordpress.com%2Fwp-content%2Fuploads%2F2013%2F11%2Fimpala-architecture.jpeg)
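
To make the distributed-query idea from the introduction more concrete, here is a minimal Python sketch of the scatter-gather pattern that MPP-style engines are generally built around: each node aggregates its own local data, and a coordinator merges the partial results. This is not Impala's actual implementation. The data in `PARTITIONS` and the functions `local_aggregate`, `merge_partials`, and `distributed_sum` are hypothetical names made up for illustration, and threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the rows stored on one node.
PARTITIONS = [
    [("us", 10), ("de", 5)],
    [("us", 7), ("fr", 3)],
    [("de", 2), ("fr", 8)],
]


def local_aggregate(rows):
    """Work done on a single node: sum values per key over local rows only."""
    partial = {}
    for key, value in rows:
        partial[key] = partial.get(key, 0) + value
    return partial


def merge_partials(partials):
    """Work done by the coordinator: combine the per-node partial sums."""
    totals = {}
    for partial in partials:
        for key, value in partial.items():
            totals[key] = totals.get(key, 0) + value
    return totals


def distributed_sum(partitions):
    """Scatter the aggregation to all 'nodes' in parallel, then gather and merge."""
    # Assumption: one worker thread per partition simulates one node per data block.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    return merge_partials(partials)


if __name__ == "__main__":
    # Roughly equivalent to: SELECT key, SUM(value) FROM t GROUP BY key
    print(distributed_sum(PARTITIONS))
```

The point of the sketch is that only small partial results travel to the coordinator, while the bulk of the data is processed where it is stored.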