pcod's Bookmarks / June 23, 2008 - Hatena Bookmark

pcod id:pcod

Bookmarks for June 23, 2008 (7 items)

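How to convert strings to IDs and back using SUFARY
How to convert strings to IDs and back using SUFARY 2008-04-10-2 [Programming] For example, when computing co-occurrence statistics for the words that appear in a huge corpus, I often use the following logic: (1) convert each word to an ID (for example, an integer) in advance, (2) do the internal processing with those IDs and output the results as IDs, and (3) convert the IDs in the output back to the original words. Features in machine-learning training data and log-data analysis are often handled the same way. This is a memo on how to do this efficiently with SUFARY. It is for people who prioritize saving disk space over speed. It works well for huge word sets (for example, 1 million, 10 million, or 100 million words). For small sets, an off-the-shelf DB or a hash should be fine. First, the preparation. Each line consists of a key string and additional information joined by a space. mkary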
pcod 2008/06/23
nlp

memo
Link
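The three-step logic in the entry above (words to IDs, process on IDs, IDs back to words) can be sketched roughly as follows. This is a minimal in-memory illustration and does not use SUFARY: it assumes a sorted word list with binary search as a stand-in index, and every function name here is hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from itertools import combinations


def build_vocab(words):
    """Return the sorted unique words; a word's ID is its index in this list."""
    return sorted(set(words))


def word_to_id(vocab, word):
    """Step 1: look up a word's integer ID by binary search."""
    i = bisect_left(vocab, word)
    if i == len(vocab) or vocab[i] != word:
        raise KeyError(word)
    return i


def id_to_word(vocab, word_id):
    """Step 3: map an integer ID back to its word."""
    return vocab[word_id]


def cooccurrence_counts(sentences, vocab):
    """Step 2: count co-occurring ID pairs per sentence, working only on IDs."""
    counts = Counter()
    for sentence in sentences:
        ids = sorted({word_to_id(vocab, w) for w in sentence})
        counts.update(combinations(ids, 2))
    return counts


# Toy corpus (made up for illustration).
sentences = [["cat", "sat", "mat"], ["cat", "mat"], ["dog", "sat"]]
vocab = build_vocab(w for s in sentences for w in s)
counts = cooccurrence_counts(sentences, vocab)

# Convert the ID pairs in the output back to words.
decoded = {
    (id_to_word(vocab, a), id_to_word(vocab, b)): n
    for (a, b), n in counts.items()
}
```

Keeping the vocabulary as one sorted list means the same structure serves both directions of the mapping, which is in the spirit of the space-over-speed trade-off the entry mentions.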
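Arab.jp is available for purchase!
Creating a URL with a short, easy-to-remember domain is a very important factor in the recognition of your products and services and in the effectiveness of your advertising.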
pcod 2008/06/23
nlp
Link
Enkin: navigation reinvented
"Enkin" introduces a new handheld navigation concept. It displays location-based content in a unique way that bridges the gap between reality and classic map-like representations. It combines GPS, orientation sensors, 3D graphics, live video, several web services, and a novel user interface into an intuitive and light navigation system for mobile devices. Stay up to date with the latest developme
pcod 2008/06/23
googlemap

AR

gis
Link
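cmecab -- A fast Mecab-Python binding
cmecab -- A fast Mecab-Python binding. First published: 2007/7/14. Status: alpha. This is an improved, faster version of the Python binding for Mecab. It does not use SWIG and implements only the minimum Mecab functionality with the Python-C API. It implements the following methods of the mecab-python binding: createTagger, Tagger.parseToNode, and data retrieval from Node (surface, feature, posid, char_type, and stat only). → A more casual introduction is here. Updates → See here for the latest information. [2007/7/16] Slight performance improvement. Added a version number: 0.1. [2007/7/15] Released. Benchmark results: retrieving the results of morphological analysis of the same short text of about 1.5 KB, run 10,000 times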
pcod 2008/06/23
python

memo
Link
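In the mecab-python binding, `parseToNode` returns the first node of a linked list, and callers follow `next` to read fields such as `surface` and `feature`. The sketch below only illustrates that traversal pattern. It does not import cmecab or MeCab: the `Node` class, `link`, and `iter_nodes` are hypothetical stand-ins, and the feature strings are simplified.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a parsed node; not the cmecab class."""
    surface: str
    feature: str
    next: Optional["Node"] = None


def link(pairs: List[Tuple[str, str]]) -> Optional[Node]:
    """Build a linked list of Node objects from (surface, feature) pairs."""
    head = None
    for surface, feature in reversed(pairs):
        head = Node(surface, feature, head)
    return head


def iter_nodes(node: Optional[Node]) -> Iterator[Node]:
    """Walk the list by following `next` until it runs out."""
    while node is not None:
        yield node
        node = node.next


# Hand-built sample standing in for the output of a real parse.
head = link([("すもも", "名詞"), ("も", "助詞"), ("もも", "名詞")])
surfaces = [n.surface for n in iter_nodes(head)]
nouns = [n.surface for n in iter_nodes(head) if n.feature.startswith("名詞")]
```

With a real tagger, the hand-built list would be replaced by the node returned from the parse call, and the loop would stay the same.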
An introduction to MCMC for machine learning, C. Andrieu et al., Machine Learning, vol. 50, pp. 5--43, Jan.-Feb. 2003.
pcod 2008/06/23
research

paper
Link
How to Write a Spelling Corrector
One week in 2007, two friends (Dean and Bill) independently told me they were amazed at Google's spelling correction. Type in a search like [speling] and Google instantly comes back with Showing results for: spelling. I thought Dean and Bill, being highly accomplished engineers and mathematicians, would have good intuitions about how this process works. But they didn't, and come to think of it, wh
pcod 2008/06/23
python

nlp
Link
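As a rough illustration of how a correction like [speling] → spelling might be produced, here is a minimal sketch of an edit-distance-based corrector: generate every string one edit away from the input and pick the known word with the highest count. The tiny `WORDS` table is a made-up assumption; a real corrector would count words in a large corpus, and this is not presented as Google's method.

```python
from collections import Counter

# Toy word counts (assumption for illustration only).
WORDS = Counter({"spelling": 12, "spilling": 3, "spewing": 1, "search": 8})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [
        left + c + right[1:] for left, right in splits if right for c in LETTERS
    ]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(candidates):
    """Keep only the candidates that appear in the word-count table."""
    return {w for w in candidates if w in WORDS}


def correct(word):
    """Prefer the word itself if known, else the most frequent known word
    one edit away, else return the input unchanged."""
    candidates = known([word]) or known(edits1(word)) or {word}
    return max(candidates, key=lambda w: WORDS[w])
```

For example, `correct("speling")` considers both "spelling" and "spewing" (each one edit away) and returns the one with the higher count in the toy table.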
Natural Language Toolkit
Our managed detection and response (MDR) services stop business disruption from cybersecurity threats. For companies wanting to reduce risks and manage the cybersecurity of their teams. Our team of highly trained cybersecurity professionals provides expertise in compliance, tool assessments, threat hunting, incident response and more. Critical Start is leading the way in Managed Detection and Respo
pcod 2008/06/23
python

nlp
Link
- June 24, 2008
- June 23, 2008
- June 22, 2008