[B! Impala] [Page 2] yass's bookmarks

yass id:yass

yass's bookmarks about Impala (41)

Comparing Stinger to Impala | Data Scientists
yass 2014/02/11
hadoop

impala

stinger
Open Source SQL-in-Hadoop Solutions: Where Are We?
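Practical Guide to Building an API Back End with Spring Boot, Second Edition: Thousands of developers have learned the basics of building REST APIs with Spring Boot from InfoQ's mini-book "Practical Guide to Building an API Back End with Spring Boot". The book uses Spring Boot 2, the version newly released at the time of publication. However, Spring Boot 3 was recently released, with important ...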
yass 2014/01/16
hadoop

sql

drill

presto

impala
Impala Performance Update: Now Reaching DBMS-Class Speed - Cloudera Blog
Impala’s speed now beats the fastest SQL-on-Hadoop alternatives. Test for yourself! Since the initial beta release of Cloudera Impala more than one year ago (October 2012), we’ve been committed to regularly updating you about its evolution into the standard for running interactive SQL queries across data in Apache Hadoop and Hadoop-based enterprise data hubs. To briefly recap where we are today: I
yass 2014/01/14
" Impala outperformed Hive by 6x to 69x (and by an average of 24x) "

impala

performance

orcfile

parquet

hive

stinger
Cloudera Impala Architecture
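(This blog post is somewhat out of date, so please see the Impala information page, which collects relatively recent information.) This is day 25, the final day, of my one-person Advent calendar. The final day is about Cloudera Impala (hereafter Impala). Impala is a distributed query engine. It recently became available on EMR as well. Some people ask how it differs from Hive and why Hive was not simply made faster; the answers are explained in detail in the blog post (Impala v Hive) published this week by Mike Olson, one of Cloudera's founders. It is quite interesting, but for now it is available in English only. A Japanese version will surely become available eventually... Now, for the final day, I will write about the architecture of Cloudera Impala. The material quoted here is what Cloudera has published on Slideshare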
yass 2013/12/27
impala

hadoop

cloudera
Announcing Support for Impala with Amazon Elastic MapReduce
We are pleased to announce support for Impala with Amazon EMR. Impala is an open source tool for real-time, ad hoc querying using a familiar SQL-like language. By using Impala on Amazon EMR, you can perform fast interactive analytics on unstructured data. For many types of queries, it's much faster than Hive. Impala's performance makes it a great engine for iterative queries and many popular BI to
yass 2013/12/14
hadoop

emr

aws

impala
"We Introduced Cloudera Impala and Presto into Ameba's Log Analysis Platform"
(This article is the day 12 entry of Hadoop Advent Calender 2013.) Hello, we are 鈴木 (@brfrn169) and 柿島大貴, who operate Patriot, Ameba's log analysis platform. For details on Patriot, see the following. http://ameblo.jp/principia-ca/entry-10635727790.html http://www.slideshare.net/cyberagent/cloudera-world-tokyo-2013 We have now introduced Cloudera Impala and Presto into Patriot, Ameba's log analysis platform. For installation instructions and details on Cloudera Impala and Presto, see the URLs below. Cloudera Impala http://www.cloudera.com/content/clo
yass 2013/12/13
" CPU Xeon E5-2650L 0 @ 1.80GHz * 2 / Disk: SATA 3TB × 12 drives (JBOD), SAS 300GB * 2 (RAID1) / Memory: 96GB (breakdown: DataNode 4GB, TaskTracker 1.5GB, RegionServer 8GB, Cloudera Impala 5GB, Presto 5GB, the rest for MapReduce slots and the OS)"

presto

Impala

Hadoop

server
Big SQL Analytics – Cloudera Using Impala | StatSlice Business Intelligence and Analytics | Business Intelligence and Analytics Dallas | Business Intelligence Training Dallas
This is an overview of the webinar on Cloudera using Impala presented by Brett Neuman on October 22, 2013. For more information on this webinar, click here to view the video or continue reading to below. Overview In this webinar, Brett gives you an overview of Cloudera’s Impala platform and how you can use it in your business. He will present a customer case study and compares query results from I
yass 2013/11/05
impala

RedShift

toread
Big Data Benchmark
Click Here for the previous version of the benchmark Introduction Several analytic frameworks have been announced in the last year. Among them are inexpensive data-warehousing solutions based on traditional Massively Parallel Processor (MPP) architectures (Redshift), systems which impose MPP-like execution engines on top of Hadoop (Impala, HAWQ), and systems which optimize MapReduce to improve per
yass 2013/11/04
RedShift

shark

impala

Hive

Benchmark

comparison

Spark

stinger

tez
Teradata Presto | Product Details | Open Source
Teradata Blogs When big data becomes vast, what's your data dropping strategy? Read more Support Teradata at Your Service (TAYS) Simple, secure customer access to products, services, education, and support function information. Read more Certifications Teradata Certified Professional Program (TCPP) Management, development, and oversight of the premiere Teradata Certification Program. Read more Con
yass 2013/11/02
" SQL processed by a specialized (Google-inspired) SQL engine that sits on a Hadoop cluster. Both Impala and Drill fall into this category. Impala is inspired by Google’s F1 project and Drill by Google’s Dremel project. "

hadoop

impala

drill

stinger

hadapt

hive

sql
Inside Impala -Query Exec Engine-
Inside Impala -Query Exec Engine- Twitter: @oza_x86 Wednesday, November 28, 2012 • Reading with a focus on the Query Exec Engine (the query execution part) • Source: http://www.slideshare.net/shiumachi/impala-15324018/22 Contents • Role of the Query Engine Executor • Implementation of the Query Engine Executor • What is a Plan • What is a Node • Input unit (row_batch) Prerequisites • The SQL has already been parsed by the FE (FrontEnd) and converted into a Plan structure defined in Thrift. • The Coordinator decides what is executed where. →@rep
yass 2013/10/03
impala
Presentations from the Cloudera Impala meetup on Aug 20 2013
This is a technical deep dive about Cloudera Impala, the project that makes scalable parallel database technology available to the Hadoop community for the first time. Impala is an open-sourced code base that allows users to issue low-latency queries to data stored in HDFS and Apache HBase using familiar SQL operators. Presenter Marcel Kornacker, creator of Impala, begins with an overview of Impala
yass 2013/09/30
Parquet 2.0

impala

parquet

cloudera
Evaluation of cloudera impala 1.1
This document evaluates the performance of Cloudera Impala 1.1 using two clusters. It finds that RCFile with Snappy compression provides the fastest performance for both Hive and Impala on the clusters for reading-only workloads. Parquet with Snappy may be fastest for larger tables. Issues were identified with memory limits during Parquet table creation and were later fixed. The evaluation shows I
yass 2013/09/30
impala

hive

parquet

benchmark

rcfile

snappy
Cloudera Blog
Enterprise IT leaders across industries are tasked with preparing their organizations for the technologies of the future – which is no simple task. With the use of AI exploding, Cloudera, in partnership with Researchscape, surveyed 600 IT leaders who work at companies with over 1,000 employees in the U.S., EMEA and APAC regions. The survey, […] Read blog post
yass 2013/09/30
impala

LLVM

toread

cloudera
What’s Next for Impala After Release 1.1 - Cloudera Blog
In December 2012, while Cloudera Impala was still in its beta phase, we provided a roadmap for planned functionality in the production release. In the same spirit of keeping Impala users, customers, and enthusiasts well informed, this post provides an updated roadmap for upcoming releases later this year and in early 2014. But first, a thank-you: Since the initial beta release, we’ve received a tr
yass 2013/09/25
impala

roadmap
Hadoop Operations Management Today
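The current state of Hadoop: Hadoop was developed in 2006 to perform web index processing. It has since come to be used for a wide range of purposes, and along the way performance was improved, security was strengthened, and many ecosystem projects emerged to make Hadoop more efficient to use. This article introduces a few of them. 1) Eliminating the master node single point of failure 2) Impala - a fast query engine for Hadoop 3) Cloudera Manager, a Hadoop operations management tool. Eliminating the single point of failure (SPOF): Do some of you have the impression that Hadoop has a single point of failure and is therefore too risky to use? Earlier versions of Hadoop did have that problem (see the previous column). To eliminate the single point of failure, Linux cluster software (Pacemaker, Red Hat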
yass 2013/06/26
hadoop

impala

CDH

HA
Parquet
Documentation Download Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression and encoding schemes to handle complex data in bulk and is supported in many programming languages and analytics tools.
yass 2013/03/13
hadoop

column oriented database

cloudera

twitter

toread
Columnar Storage - Yet Another HDIF?
Disclaimer: The opinions expressed here are my own and do not necessarily represent those of current or past employers. Twitter / Photos I translated an explanatory article on columnar storage by Henry Robinson. Columnar storage is the file format used in Dremel, a data processing tool developed at Google, and for Impala, which Cloudera is developing, adoption is also
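yass 2013/02/13
"I translated an explanatory article on columnar storage by Henry Robinson. Columnar storage is the file format used in Dremel, a data processing tool developed at Google / It is also planned for adoption in Impala, which Cloudera is developing"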

column oriented database

hadoop

impala

filesystem

columnar storage
Impala Q&A - still deeper
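I attended Cloudera World Tokyo, held on 2012/11/7. I will skip the main sessions, since others will surely write them up. At the reception, Dr. Amr Awadallah, CTO of Cloudera in the US, answered my questions about Impala directly. It was a very valuable conversation, so I am summarizing it here (published with permission). I was not taking notes on the spot and am writing from memory, so I would welcome additions from those who were listening with me and from Cloudera staff. Q&A Q. Why was it implemented in C++ rather than Java? A. Because Impala's main designer is someone who implemented distributed processing (Dremel?) in C++ at Google, and because JVM startup cost adds latency. Note: is it this person? Q. When a query arrives that needs to read data skewed onto a single node, low latency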
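yass 2013/02/06
"In production today RCFile is the better choice, but trevni is recommended for the future because its performance is better. There is no major difference between RCFile and trevni at the specification level, but Doug Cutting's implementation of trevni is superior."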

hadoop

impala

rcfile

trevni
Reviewing the Potential of Impala, a Real-Time Query Engine for Hadoop
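Reviewing the Potential of Impala, a Real-Time Query Engine for Hadoop: Database Technology Report (page 1/4). An early review of the real-time query engine for Hadoop that was announced only on October 24, 2012. How can you make the most of this new feature, which is slated for inclusion in the next CDH? What is Impala: Impala is open source software developed with inspiration from Dremel and F1, which Google uses internally. It is a tool for running ad hoc queries against data stored in HDFS (Hadoop Distributed File System) or Apache HBase. It is developed by Cloudera, well known as a Hadoop distribution vendor*1. Software in the Hadoop family is generally developed in Java, but Imp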
yass 2012/12/10
impala

hadoop
Cloudera Impala Presentation Materials | 外道父の匠
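On 11/26 I gave a presentation on Cloudera Impala at "Hadoop Source Code Reading #13". It started on Twitter with a suggestion along the lines of "why not invite 外道父, who would ◯ even the incarnation of beer?", and in under a minute a tweet arrived with the request and I accepted. This is an industry with very light footwork, where everything is settled on Twitter within a few minutes. Below are the presentation materials and some supplementary notes. Links Eventbrite : Hadoop Source Code Reading #13 Twitter #hadoopreading togetter : Hadoop Source Code Reading #13 summary Inside Impala Coordinator at HSCR 13th – Go ahead! by @repeatedly Inside Impala -Query Exec Engine- by @o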
yass 2012/11/27
impala

hadoop

cloudera