[B! presto] msykt's bookmarks

Dynamic filtering for highly-selective join optimization

msykt 2019/07/25

“Our idea was to extend Presto’s predicate pushdown support from the planning phase to run-time, in order to skip reading the non-relevant rows from our connector into Presto.”

presto


Engineering Data Analytics with Presto and Parquet at Uber

Engineering Data Analytics with Presto and Apache Parquet at Uber. July 11, 2017. From determining the most convenient rider pickup points to predicting the fastest routes, Uber uses data-driven analytics to create seamless trip experiences. Within engineering, analytics inform decision-making processes across the board. As we expand to new markets, the ability to accurately and qui

msykt 2018/11/22


GitHub - Netflix/iceberg: Iceberg is a table format for large, slow-moving tabular data


msykt 2018/01/06

Apparently Netflix is building a table format aimed at Hadoop-based data that is rarely updated.


Hadoop Query Performance Smackdown

The document presents a comprehensive performance evaluation of various SQL query engines within a big data environment at Comcast, detailing the test setup and methodology used to assess their efficiency with TPC-DS datasets. Key findings reveal that LLAP exhibited the fastest execution times, outperforming Presto and Tez, while MapReduce and the Spark Thrift Server were identified as underperfor

msykt 2017/07/06

The query engine comparison material that was introduced at today's report meeting #hadoopreading


Interactive Queries with Presto - dbtech showcase 2014 Tokyo

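Treasure Data provides a low-latency query service powered by Presto over data collected with tools such as Fluentd. This lets users gain insights from their data quickly and improves the productivity of data analysis. These slides introduce the features of Presto, a distributed SQL engine, and its implementation. This content was presented at dbtech showcase 2014 Tokyo @ Akihabara UDX. http://www.insight-tec.com/dbts-tokyo-2014.html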

msykt 2014/12/23

prestogres is interesting. So it is not quite the same thing as just having a postgres interface, huh.

presto


A Poor Man's Log Summary with Python + Hive on AWS EMR

Sep 14, 2014. 22 likes, 6,707 views. AI-enhanced description: 1. Akira Chiku is an engineer who works on an engineering team. Their requirements include collecting between 10-20GB of data per day from various sources like Hadoop and Hive. 2. Data is collected from sources like Fluentd and parsed using Query String and stored in Hive. It is then processed and visualized. 3. Data can be stored in S3, proce

msykt 2014/09/16

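Informative / “We compared Impala and Presto and adopted Presto, which can also query S3 directly. (Impala is reportedly going to support querying S3 directly in its next version, so we plan to re-evaluate it then.)”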


MPP on Hadoop, Redshift, BigQuery - Go ahead!

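The pressure on Twitter to "hurry up and write about the rough differences in how to use the currently popular MPPs!" is intense, so I will write something casually. This article is based on my own experience and on what I have heard from users at study meetups and the like, so not all of it is firsthand (especially BigQuery). If you ask the SAs at each company, they may be able to tell you about better approaches or more details. I have never used an on-premises commercial MPP, so no comment there. Presto is the main focus for MPP on Hadoop because it is what I use most right now, and I assume other MPP-on-Hadoop systems such as Impala are similar. Of course there are implementation differences, so fill in those gaps yourself as appropriate. Premise: you are developing an application and building an analytics platform for it from scratch. Quick summary: if you can build a place to store the data, you can query it directly with Pre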

msykt 2014/07/24

Exactly what I wanted to know, beautifully summarized... Wonderful.


What are the main differences between Facebook, Presto, and Amplab Shark?

Answer (1 of 2): 1. Primary Use Case: While both are intended for analytics, Shark's primary use case is providing SQL to an (extremely fast) in-memory database, with support also for on-disk (or abstract) data sources. Presto is designed to be a fast SQL engine for the latter, and does not have ...

msykt 2014/06/03

The information is a bit dated, but it is useful.


Presto Source Code Reading

Presto: http://prestodb.io/ Presto: Interacting with petabytes of data at Facebook https://www.facebook.com/notes/facebook-engineering/presto-interacting-with-petabytes-of-data-at-facebook/10151786197628920

msykt 2014/06/02

presto
