Live suggestions as you type into a search box, sometimes called suggest or autocomplete, have been a standard, essential search feature ever since Google set a high bar after going live just over four years ago. Lucene has several different suggest implementations under the suggest module; today I'm describing the new AnalyzingSuggester (to be committed soon; it should be available in 4.1).
There are many exciting improvements in Lucene's eventual 4.0 (trunk) release, but the awesome speedup to FuzzyQuery really stands out, not only for its incredible gains but also for the amazing behind-the-scenes story of how it all came to be. FuzzyQuery matches terms "close" to a specified base term: you specify an allowed maximum edit distance, and any terms within that edit distance fr
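To make "within an allowed maximum edit distance" concrete, here is a minimal sketch that filters a small term list with the classic Levenshtein distance (insertions, deletions, substitutions). The class and method names (`EditDistanceSketch`, `distance`, `fuzzyMatches`) are hypothetical and chosen only for illustration; this brute-force scan is not how Lucene's FuzzyQuery is implemented, it only shows what the matching criterion means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helper, not part of Lucene.
class EditDistanceSketch {

    // Classic Levenshtein distance using two rolling rows of the DP table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty string into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning a[0..i) into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Keep every term whose distance to the base term is at most maxEdits.
    static List<String> fuzzyMatches(String base, int maxEdits, List<String> terms) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (distance(base, term) <= maxEdits) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "lucerne", "solr");
        System.out.println(fuzzyMatches("lucene", 1, terms));
    }
}
```

Scanning every term like this costs time proportional to the size of the term dictionary, which is why a naive approach becomes slow on a large index.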
If you've ever wondered how Lucene picks segments to merge during indexing, it looks something like this: That video displays segment merges while indexing the entire Wikipedia (English) export (29 GB plain text), played back at ~8X real-time. Each segment is a bar, whose height is the size (in MB) of the segment (log-scale). Segments on the left are largest; as new segments are flushed, they appe
While indexing, Lucene periodically merges multiple segments in the index into a single larger segment. This keeps the number of segments relatively contained (important for search performance), and also reclaims disk space for any deleted docs on those segments. However, it has a well-known problem: the merging process evicts pages from the OS's buffer cache. The eviction is ~2X the size of the m