[B! Search] grafi's bookmarks

YaCy is free software for your own search engine. Join a community of search engines or make your own search portal! There are three use cases you can choose from. P2P Mode: web search by the people, for the people: decentralized, all users are equal, no central, no search request storage, shared index. Your Search Portal: your YaCy installation is independent from other peers. Define your own

grafi 2012/04/30

TechCrunch | Startup and Technology News

Care/of, a company offering personalized subscription vitamin packs, says it will be canceling all subscriptions as of Monday, June 17 and will no longer be accepting new orders. The news…

grafi 2011/12/11

Understanding the high-performance compression code "vertical code" in 30 minutes - EchizenBlog-Zwei

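There are situations, such as a search engine's inverted index, where you want to store a sequence of data in as little space as possible. In such cases it is common to use a compression code, and there are many kinds: unary code, gamma code, delta code, and so on. My top pick among compression codes is vertical code (vcode). Proposed by Okanohara (@hillbig), it has a simple mechanism yet delivers performance on par with delta code. In this article I narrow vcode down to its key points and explain it so that it can be understood in 30 minutes. vcode is easy to understand if you picture arranging books on a bookshelf. A bookshelf's height is fixed in advance, so you prepare a shelf tall enough to hold every book. In other words, imagine a shelf like this. This bookshelf holds eight books, but the fifth from the left is taller than the others. A tall bookshelf is therefore needed to fit the fifth book. However, since the other books are not as tall as the fifth, the fifth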
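For readers unfamiliar with the baseline codes named in the snippet above, here is a minimal sketch of the Elias gamma code in C++. It is not vcode itself, whose details are cut off in the excerpt. The function names `gamma_encode` and `gamma_decode` are hypothetical, and the bits are kept in a `std::string` of '0'/'1' characters purely for readability (an assumption; a real index would pack them into machine words).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Elias gamma code for a positive integer n:
// floor(log2 n) zero bits, followed by the binary form of n (which starts with 1).
// Precondition: n >= 1 (the gamma code has no codeword for 0).
std::string gamma_encode(std::uint32_t n) {
    int bits = 0;
    for (std::uint32_t t = n; t > 1; t >>= 1) ++bits;  // bits = floor(log2 n)
    std::string out(static_cast<std::size_t>(bits), '0');
    for (int i = bits; i >= 0; --i) {
        out += ((n >> i) & 1u) ? '1' : '0';
    }
    return out;
}

// Decodes one codeword starting at pos and advances pos past it.
// Precondition: s holds a well-formed gamma codeword at pos.
std::uint32_t gamma_decode(const std::string& s, std::size_t& pos) {
    int bits = 0;
    while (s[pos] == '0') {  // count the leading zeros
        ++bits;
        ++pos;
    }
    std::uint32_t n = 0;
    for (int i = 0; i <= bits; ++i) {  // read bits + 1 binary digits
        n = (n << 1) | (s[pos++] == '1' ? 1u : 0u);
    }
    return n;
}
```

Small values get short codewords, which is why such codes suit the gaps between sorted document IDs in an inverted index: 1 is encoded as `1`, 2 as `010`, and 5 as `00101`.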

grafi 2011/12/10

TechCrunch

Happy Saturday, folks, and welcome to Week in Review (WiR), TechCrunch’s newsletter that covers the major stories in tech over the past several days. I feel inclined to begin this edition with a

grafi 2011/11/23

Substring search algorithm

Describes a new online substring search algorithm that allows faster string traversal. The implementation presented here is substantially faster than any other online substring search algorithm in the average case. A substring (needle) SS of length M is sought in a source (haystack) string S of length N. The algorithm steps sequentially through string S and probes a word W (2 or more bytes) to check whether it belongs to SS. S

grafi 2011/11/22

Wolfram|Alpha: Making the world’s knowledge computable

Compute expert-level answers using Wolfram’s breakthrough algorithms, knowledgebase and AI technology. Mathematics › Step-by-Step Solutions, Elementary Math, Algebra, Plotting & Graphics, Calculus & Analysis, Geometry, Differential Equations, Statistics, More Topics ». Science & Technology › Units & Measures, Physics, Chemistry, Engineering, Computational Sciences, Earth Sciences, Materials, Transportation, More Topics ». Society & Cul

grafi 2011/11/18

Anuenue, a search package built on Apache Solr - mixi engineer blog

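This is takahi-i from the R&D group. Today I would like to introduce Anuenue, a tool I mentioned only by name the other day. Anuenue is a wrapper around Apache Solr, built to make search clusters easier to set up and operate. This article first explains why we chose Apache Solr, then describes the background and goals behind the tool. In the second half, we actually bring up a search cluster using Anuenue. Why we adopted Apache Solr: last autumn, a plan to replace our search engine was drawn up internally, and we compared many OSS search engines to choose the one to build on. What we emphasized was single-machine search performance, along with ease of maintenance and the size of the developer community. Regarding the maintainability of a search engine, what we considered especially important was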

grafi 2011/10/27

Behind the scenes of Hatena Bookmark's full-text search

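Things have finally settled down, so I will write about what goes on behind Hatena Bookmark's full-text search feature. On the PFI side, the work was led mainly by two people: id:nobu-q, who has been working with us part-time since around August, and id:kzk (see: production staff). Other members gave us a lot of advice on the mathematical parts. On the Hatena side, mainly id:naoya listened to our wishes and requests. Development took roughly one to two months, and in early September id:naoya came to our office once for a development camp. The rest of the development proceeded while keeping in touch over Skype chat. id:stanaka helped us with infrastructure, and id:jkondo and id:kossy helped with the contract. The full-text search engine Sedue: this search engine is built on a product called Sedue. Sedu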

grafi 2011/08/05

KMP + Boyer-Moore = finite automaton - 簡潔なQ

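That is the feeling I got, so I tried turning string search into a finite automaton. ...Honestly, this works just fine! Building the finite automaton takes O(cn) time and memory, where n is the key size and c is the number of distinct characters, so it falls a little behind there, but the search itself becomes very simple and fast (each character of the searched text is read only once). I think it is effective when the search string is small and the text being searched is large. #include <cstdio> void simple_search(const char *str, const char *key) { for(int i = 0; str[i]; i++) { int j; for(j = 0; str[i+j] && key[j] && str[i+j]==key[j]; j++) { } if(!key[j]) { printf("A: %d\n", i); } } } void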
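
The excerpt is cut off before the automaton version appears, so the following is only a sketch of the general idea, not the author's code. It uses the textbook KMP-style DFA construction, which matches the O(cn) build cost mentioned above with c fixed at 256 byte values. The names `kAlphabet`, `Dfa`, `build_dfa`, and `dfa_search` are hypothetical, and returning match offsets in a vector (rather than printing them) is an assumption made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kAlphabet = 256;  // c: number of distinct byte values

// One row per state; row[ch] is the next state after reading byte ch.
using Dfa = std::vector<std::array<std::size_t, kAlphabet>>;

// Builds a DFA with states 0..n, where state n means "key just matched".
// Time and memory are O(c * n).
Dfa build_dfa(const std::string& key) {
    const std::size_t n = key.size();
    Dfa dfa(n + 1);
    for (auto& row : dfa) row.fill(0);
    if (n == 0) return dfa;
    dfa[0][static_cast<unsigned char>(key[0])] = 1;
    std::size_t x = 0;  // restart state: the state reached by key[1..j-1]
    for (std::size_t j = 1; j <= n; ++j) {
        dfa[j] = dfa[x];  // on a mismatch, behave like the restart state
        if (j < n) {
            const unsigned char ch = static_cast<unsigned char>(key[j]);
            dfa[j][ch] = j + 1;  // on a match, advance
            x = dfa[x][ch];      // update the restart state
        }
    }
    return dfa;
}

// Returns the start offset of every (possibly overlapping) occurrence of key.
// Each character of str is read exactly once.
std::vector<std::size_t> dfa_search(const std::string& str, const std::string& key) {
    std::vector<std::size_t> hits;
    if (key.empty()) return hits;
    const Dfa dfa = build_dfa(key);
    const std::size_t n = key.size();
    std::size_t state = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        state = dfa[state][static_cast<unsigned char>(str[i])];
        if (state == n) hits.push_back(i + 1 - n);
    }
    return hits;
}
```

For example, `dfa_search("abababa", "aba")` yields the offsets 0, 2, and 4. The table costs 256 entries per key character, which is the trade-off the post describes: a heavier setup in exchange for a search loop that is a single table lookup per input byte.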