[B! hyperLogLog] manboubirdのブックマーク

manboubird id:manboubird

hyperLogLogに関するmanboubirdのブックマーク (26)

Count-distinct problem - Wikipedia
manboubird 2020/11/14
hyperLogLog

algorithm

countDistinctProblem

approximateQuery

dataSketch
リンク
GitHub - google/zetasketch: A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: HyperLogLog++; more to come.
manboubird 2020/11/14
hyperLogLog

approximateQuery

zetasketch

sketch

google

algorithm

dataSketch
リンク
Efficient Aggregation, Roll-ups with BigQuery HyperLogLog++ functions
manboubird 2019/08/19
hyperLogLog

bigQuery
リンク
KHyperLogLog: Estimating Reidentifiability and Joinability of Large Data at Scale
Philosophy We strive to create an environment conducive to many different types of research across many different time scales and levels of risk. Learn more about our Philosophy Learn more
manboubird 2019/05/18
KHyperLogLog: Estimating Reidentifiability and Joinability of Large Data at Scale

paper

hyperLogLog

google

dataManagement

join
リンク
Data Sketching - ACM Queue
May 31, 2017 Volume 15, issue 2 PDF Data Sketching The approximate approach is often faster and more efficient. Graham Cormode Do you ever feel overwhelmed by an unending stream of information? It can seem like a barrage of new em ail and text messages demands constant attention, and there are also phone calls to pick up, articles to read, and knocks on the door to answer. Putting these pieces toge
manboubird 2019/03/09
dataSketching

sql

acmqueue

BloomFilter

probabilisticDataStructure

hyperLogLog

algorithm
リンク
Counting uniques faster in BigQuery with HyperLogLog++ | Google Cloud Blog
manboubird 2018/06/03
bigQuery

hyperLogLog

approximationQuery
リンク
DataSketches |
manboubird 2018/05/04
dataSketches

yahoo

paper

links

hive

udf

hyperLogLog

sketch
リンク
Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles
manboubird 2016/06/04
Spark

hyperLogLog
リンク
https://dl.acm.org/citation.cfm?doid=2452376.2452456
manboubird 2015/07/15
HyperLogLog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm

paper

edbt

hyperLogLog

algorithm
リンク
Tableau Research
About Tableau Research is an industrial research team focused on Tableau’s mission of helping people see and understand data. We actively work to be a source of new and inspiring product and techno logy directions, generating ideas that influence, drive, or significantly change what Tableau delivers to customers. We are also active members of the academic community, where we regularly publish and p
manboubird 2015/06/03
On Improving User Response Times in Tableau

paper

tableau

sigmod
リンク
乱択データ構造の最新事情 −MinHash と HyperLogLog の最近の進歩− - iwiwiの日記
本日，PFI セミナーにて「乱択データ構造の最新事情 −MinHash と HyperLogLog の最近の進歩−」というタイトルで話をさせてもらいました．スライドは以下になります． Ustream の録画もあります． http://www.ustream.tv/recorded/48151077 内容としては，以下の操作を効率的に行うための集合に関するデータ構造 (Sketch) の最近の進歩を紹介しました．集合の類似度の推定 (Jaccard 係数) 集合異なり数の推定 (distinct counting) どちらも重要かつ基礎的な操作で，b-bit MinHash や HyperLogLog など，既に実用的な手法が提案されており，実際にも使われています．しかし，2014 年になって，Odd Sketch や HIP Estimator という，これらをさらに改善する手法が立て続
manboubird 2014/12/03
hyperLogLog

minhash

slide
リンク
乱択データ構造の最新事情－MinHash と HyperLogLog の最近の進歩－
MinHash, b-bit MinHash, HyperLogLog, Odd Sketch, HIP Estimator の解説です．Read less
manboubird 2014/08/27
hyperLogLog

slide

algorithm
リンク
HyperLogLog in Pure SQL | Sisense
Using probabilistic counting, we’ll make count distinct even faster, trading a little accuracy for the increase in speed.
manboubird 2014/06/25
hyperLogLog

sql
リンク
HyperLogLogで遊ぶ - Negative/Positive Thinking
はじめに「さぁ、お前の罪の異なり数を数えろ！」と言われたときに使えそうな「HyperLogLog」という異なり数をカウントする方法を教えてもらったので、遊んでみた。いつもながら論文ちゃんと読んでないので、条件やコード間違ってるかも。。。 HyperLogLogとは cardinalityと呼ばれる、要素の異なり数を決定する問題かなり省メモリで精度のよい異なり数を推定できる方法要素をそのまま保存せず、ハッシュ値に変換したものをうまくレジスタに保存しておくので、レジスタサイズ程度しかメモリを使わない並列化もできて、最近のbigdataとかで注目されているまた、googleが並列計算用に改善したHyperLogLogを提案してるみたい http://blog.aggregateknowledge.com/2013/01/24/hyperloglog-googles-take-on-
manboubird 2014/04/01
hyperLogLog
リンク
Screen6 - HyperLogLog with Cascalog
We’ll look briefly in how you would utilize awesomeness of both Cascalog and HyperLogLog in order to execute Hadoop M/R tasks with amounts of data too big to have them in their original form. Intro HyperLogLog Cardinality estimator allowing you to count amount of distinct values. Cascalog The main use cases for Cascalog are processing "Big Data" on top of Hadoop or doing analysis on your local com
manboubird 2013/11/20
hyperLogLog

cascalog
リンク
https://tech.nextroll.com/media/hllminhash.pdf
manboubird 2013/09/28
HyperLogLog and MinHash A Union for Intersections, Andrew Pascoe

hyperLogLog

minhash

algorithm

paper

adRoll
リンク
GitHub - MLnick/hive-udf: Approximate cardinality estimation with HyperLogLog, as a Hive function
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
manboubird 2013/02/07
hive

hyperLogLog

algorithm

udf
リンク
Probabilistic Data Structures for Web Analytics and Data Mining
Statistical analysis and mining of huge multi-terabyte data sets is a common task nowadays, especially in the areas like web analytics and Internet advertising. Analysis of such large data sets often requires powerful distributed data stores like Hadoop and heavy data processing with techniques like MapReduce. This approach often leads to heavyweight high-latency analytical processes and poor appl
manboubird 2013/02/07
hyperLogLog

algorithm

analytics

countSketch
リンク
http://research.google.com/pubs/archive/40671.pdf
manboubird 2013/02/07
paper

hyperLogLog

algorithm

google
リンク
HyperLogLog++: Google’s Take On Engineering HLL –
Matt Abrams recently pointed me to Google’s excellent paper “HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm” [UPDATE: changed the link to the paper version without typos] and I thought I’d share my take on it and explain a few points that I had trouble getting through the first time. The paper offers a few interesting improvements that are w
manboubird 2013/02/07
hyperLogLog

algorithm

google

paper
リンク
1 2 次のページ

お知らせ

もっと読む

公式Twitter

@HatenaBookmark
リリース、障害情報などのサービスのお知らせ
@hatebu
最新の人気エントリーの配信

キーボードショートカット一覧

j次のブックマーク

k前のブックマーク

lあとで読む

eコメント一覧を開く

oページを開く

設定を変更しましたx