[B! algorithm] [Page 5] yass's bookmarks

yass id:yass

yass's bookmarks on algorithm (155)

The World of Hash Tables You Don't Know
yass 2012/12/06
Hash

algorithm
Link
Y-You pervert!! Stop it, I can't read that! A collection of data structure and algorithm implementations using bits - Negative/Positive Thinking
This article is the day-2 entry of Competitive Programming Advent Calendar Div2012. Added December 20: In the day-20 entry, Darsein introduces a detailed explanation of bit operations! A must-read!!!! :) Introduction
yass 2012/12/03
bit

algorithm

xorshift
Link
Similarity Join Algorithms: An Introduction
yass 2012/11/28
similarity

join

algorithm

toread

LSH

minhash
Link
Fast Intersection of Sorted Lists Using SSE Instructions
Intersection of sorted lists is a cornerstone operation in many applications including search engines and databases because indexes are often implemented using different types of sorted structures. At GridDynamics, we recently worked on a custom database for realtime web analytics where fast intersection of very large lists of IDs was a must for good performance. From a functional point of view, w
yass 2012/11/18
intersection

algorithm

hardware

integer

sort

java

lucene

sse

simd
Link
Google Code Archive - Long-term storage for Google Code Project Hosting.
yass 2012/11/18
similarity

sparse vector

google

algorithm
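Link
Searching a Suffix Array Using LCP (Longest Common Prefix) - EchizenBlog-Zwei
A Suffix Array consists of "index construction" and "keyword search". Construction requires sorting strings, and search requires a binary search over strings. When I previously implemented tsubomi, a Compressed Suffix Array library, I used an algorithm called multikey quicksort for the sort. For the binary search, on the other hand, I had not done anything special. Leaving it that way bothered me, so I reread the Suffix Array paper and found that it describes a binary search method using the LCP (Longest Common Prefix). It is a simple but clever method, so I am noting it down. This is impressive (or rather, the real problem is that I had skipped over it until now. Yes). Now then. First, as for what the LCP (Longest Common Prefix) is, that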
yass 2012/11/17
LCP

algorithm

data structure

suffix array

index
Link
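As a rough illustration of the "construction by sorting, search by binary search" split described in the Suffix Array entry above, here is a minimal Python sketch. It is not the tsubomi implementation and it does not include the LCP acceleration the article goes on to explain; the function names are hypothetical, and the naive construction (sorting whole suffixes) is only suitable for short strings.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive construction: sort suffix start positions by the suffix text."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern, using two binary searches.

    Each comparison looks at the first len(pattern) characters of a suffix.
    This plain version re-compares from the first character every time,
    which is the cost the LCP-based method is meant to reduce.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

For example, with text "abracadabra" the pattern "abra" is found at positions 0 and 7.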
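Understanding vertical code, a high-performance compression code, in 30 minutes - EchizenBlog-Zwei
There are situations, such as a search engine's inverted index, where you want to hold a data sequence in a small data size. In such cases it is common to use a compression code, and there are various kinds such as unary code, gamma code, and delta code. My top pick among compression codes is vertical code (vcode). This is a compression code proposed by Okanohara (@hillbig) that achieves performance on par with delta code using a simple mechanism. In this article I try to explain vcode, narrowed down to its key points, so that it can be understood in 30 minutes. vcode is easy to understand if you picture the task of lining up books on a bookshelf. The height of a bookshelf is fixed in advance, so you prepare a bookshelf that fits all the books. That is, imagine something like that. This bookshelf holds 8 books, but the 5th book from the left is taller than the others. Because of this, a tall bookshelf matching the 5th book is needed. But since the other books are not as tall as the 5th book, the 5th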
yass 2012/11/17
compression

algorithm

bit

encoding
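Link
Compression effects of gamma code, delta code, and Golomb code - naoyaのはてなダイアリー
Ordinary integers are fixed-length binary codes of 32 bits, that is, 4 bytes, but under a probability distribution where small numbers appear often and large numbers hardly ever appear, the wasted bits stand out. Variable Byte Code (also called Byte Aligned code) is one integer encoding technique, and it removes some of this waste. Details are given in Chapter 5 of Introduction to Information Retrieval (hereafter IIR). (Available at http://nlp.stanford.edu/IR-book/html/htmledition/variable-byte-codes-1.html) As the name suggests, Variable Byte Code is a byte-level variable-length code that treats the leading 1 bit of each byte as a continuation bit, and the following 7 bits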
yass 2012/11/12
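" Unlike the parameter-free gamma and delta codes, Golomb code requires a parameter for adjusting the code length. For an inverted index, this parameter can be derived from the total number of documents and the number of terms "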

integer

encoding

gamma coding

delta coding

compression

vByte

algorithm

Golomb coding
Link
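The Variable Byte Code described in the entry above can be sketched in a few lines of Python. This sketch assumes the convention used in the IIR chapter the snippet cites, where the continuation bit is 1 on the last byte of a number and 0 on the others; the snippet itself is cut off before it states the convention, and other implementations invert it. The function names are hypothetical.

```python
def vb_encode_number(n: int) -> bytes:
    """Encode one non-negative integer, 7 payload bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    chunks = [n % 128]
    n //= 128
    while n > 0:
        chunks.insert(0, n % 128)  # higher-order 7-bit groups go first
        n //= 128
    chunks[-1] += 128  # set the top bit on the final byte (assumed convention)
    return bytes(chunks)


def vb_encode(numbers: list[int]) -> bytes:
    """Concatenate the encodings of a list of integers."""
    return b"".join(vb_encode_number(n) for n in numbers)


def vb_decode(data: bytes) -> list[int]:
    """Decode a byte stream produced by vb_encode."""
    numbers = []
    n = 0
    for byte in data:
        if byte < 128:
            n = 128 * n + byte          # more bytes follow
        else:
            n = 128 * n + (byte - 128)  # final byte of this number
            numbers.append(n)
            n = 0
    return numbers
```

Small values such as 5 take a single byte, while 824 takes two bytes, compared with 4 bytes each in a fixed 32-bit layout.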
Jean-Philippe Aumasson
Cryptographer, co-founder & chief security officer @ Taurus. Books Serious Cryptography (No Starch Press, 2017) Translations' covers 🚧 Second edition: to appear in 2024 (No Starch Press) 🚧 French translation: to appear in 2024 (Dunod) Petit Pingo uin (self-published, 2021) Crypto Dictionary (No Starch Press, 2020) The Hash Function BLAKE (Springer, 2014) Crypto projects Hash functions BLAKE, BLAK
yass 2012/11/11
hash

SipHash

Algorithm
Link
http://hpc.isti.cnr.it/
yass 2012/11/10
integer

encoding

algorithm

PForDelta

simple9

delta coding

comparison

gamma coding

vByte
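Link
Delta encoding (差分符号化) - Wikipedia
Delta encoding (差分符号化) is a method of storing or transferring data in the form of differences between sequential data rather than as complete files. Especially when the purpose is to keep a change history (for software projects and the like), delta encoding is also called delta compression. It also goes by the names デルタ符号化 and デルタ圧縮, but it is distinct from the delta code (デルタ符号). For example, the UNIX file comparison utility diff creates a "difference" or "delta", which is recorded as a separate file. A difference is generally smaller than the original file, so delta encoding can greatly reduce data redundancy. A series of difference files saves far more storage than keeping the full file for every version. From a logical point of view, given the difference between two pieces of data, one of them can be obtained from the other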
yass 2012/11/10
algorithm

delta encoding

encoding

integer
Link
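The idea in the delta encoding entry above, storing differences between sequential data rather than the full values, can be illustrated on a list of integers, which matches this entry's "integer" tag. This is a minimal sketch rather than anything taken from the Wikipedia article: the function names are hypothetical, and real systems usually follow this step with a variable-length code for the small differences.

```python
def delta_encode(values: list[int]) -> list[int]:
    """Replace each value by its difference from the previous one.

    The first value is stored as its difference from 0, i.e. unchanged.
    """
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original values with a running sum."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values
```

For a sorted list such as [3, 7, 8, 15] the encoded form is [3, 4, 1, 7]; the differences stay small even when the original values grow large.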
Notice of service termination
Notice of service termination. Thank you for always using Yahoo! JAPAN services. The service you tried to access has been terminated as of today. We hope you will continue to use Yahoo! JAPAN services.
yass 2012/11/10
encoding

algorithm

compression

integer

RLE
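Link
Compressed Permuterm Index: a feature-rich, memory-efficient data structure for keyword dictionary search - Preferred Networks Research & Development
Hello, nice to meet you. I am Marumaru (Maruyama), and I have been working at PFI since April. My current obsession is sudachi. Since updates to the research blog have resumed, I thought I would ride the wave and write my first blog post. This time I would like to introduce the Compressed Permuterm Index, which came up briefly in our in-house information retrieval reading group. Paolo Ferragina and Rossano Venturini. "The compressed permuterm index", ACM Transactions on Algorithms 7(1): 10 (2010). [pdf] I implemented this, so I am putting it up on the following google code. http://code.google.com/p/cpi00/ It is under the modified BSD license. You may do as you like with the source code, but it is still far from polished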
yass 2012/11/06
algorithm

Data Structure

dictionary

search

index
Link
1MB Sorting Explained
In my previous post, I shared some source code to sort one million 8-digit numbers in 1MB of RAM as an answer to this Stack Overflow question. The program works, but I didn’t explain how, leaving it as a kind of puzzle for the reader. I had promised to explain it in a followup post, and in the meantime, there’s been a flurry of discussion in the comments and on Reddit. In particular, commenter Ben
yass 2012/10/28
algorithm

sort

toread
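Link
Double arrays that even a master's student in information science can understand - アスペ日記
I read partway through the recently much-discussed book 「日本語入力を支える技術」. Chapter 3 is written with tremendous care. It explains in detail two implementations of the trie data structure, the "double array" and "LOUDS". I once tried to study double arrays by reading a paper, but I remember giving up at the time because it was too difficult. Reading this book's explanation, however, I was able to understand them. I am grateful. I was so impressed that I held a two-person study session with a friend using this book as the teaching material. These two-person sessions mostly take the form of me teaching my friend as a way of reviewing. When I actually tried it, though, various things turned out to be difficult. People seem to get stuck on points like the following. The examples are small, so it is hard to form a mental image. The node numbers in the first figure differ from the final positions in the double array, which causes confusion. Word endings are not mentioned, so which nodes represent words is unc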
yass 2012/10/11
DoubleArray

trie

algorithm
Link
Double-Array Articles
These are pages that publish double-array libraries. An Implementation of Double-Array Trie URL: http://linux.thai.net/~thep/datrie/datrie.html Darts: Double-ARray Trie System URL: http://chasen.org/~taku/software/darts/ Dame URL: http://www.void.in/wiki/Dame Tiny Double-Array Library URL: http://nanika.osonae.com/TinyDA/index.html Dynamic Double-Array Library URL: http://nanika.osonae.com/DynDA/index.html
yass 2012/10/08
trie

algorithm

DoubleArray
Link
Wavelet Matrix | PDF
yass 2012/09/30
data structure

wavelet

algorithm

WaveletMatrix
Link
Sort Benchmark Home Page
New: The deadline for the 2023 sort contest is 31 December 2023. Background Until 2007, the sort benchmarks were primarily defined, sponsored and administered by Jim Gray. Following Jim's disappearance at sea in January 2007, the sort benchmarks have been continued by a committee of past colleagues and sort benchmark winners. The Sort Benchmark committee members include: Chris Nyberg of Ordinal Te
yass 2012/09/25
sort

algorithm

benchmark

comparison

toread
Link
To Trie or not to Trie – a comparison of efficient data structures
Since my discussion thread on the efficiency of the in-memory data structure of ZeroMQ with Martin Sustrik, I have been reading up a bit by bit on efficient data structures, primarily from the perspective of memory utilization. Data structures that provide constant lookup time with minimal memory utilization can give a significant performance boost since access to CPU cache is considerably faster
yass 2012/09/21
data structure

algorithm

memory

trie

ternary search tree
Link
Google Code Archive - Long-term storage for Google Code Project Hosting.
yass 2012/09/21
algorithm

comparison

java

memory

tree

trie

kd-tree

skip list

red-black tree
Link