cantaloupe's bookmarks / January 6, 2016

cantaloupe id:cantaloupe

Bookmarks for January 6, 2016 (7 items)

Doc2Vec tutorial using Gensim
cantaloupe 2016/01/06
Doc2vec tutorial | RARE Technologies
The latest gensim release of 0.10.3 has a new class named Doc2Vec. All credit for this class, which is an implementation of Quoc Le & Tomáš Mikolov: “Distributed Representations of Sentences and Documents”, as well as for this tutorial, goes to the illustrious Tim Emerick. Doc2vec (aka paragraph2vec, aka sentence embeddings) modifies the word2vec algorithm to unsupervised learning of continuous re
cantaloupe 2016/01/06
doc2vec
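Computing document-to-document similarity with Doc2Vec, the successor to Word2Vec - Qiita
■ Customizing doc2vec.py. Change 1: With the default doc2vec.py, the label returned in the response could not be customized, so I changed it so that results can be retrieved by the label you set. Change 2: By default, asking doc2vec.py which documents are similar to a given document returns both documents and words, so I also wrote a method that returns only the similar documents. #!/usr/bin/env python # -*- coding: utf-8 -*- # # Copyright (C) 2013 Radim Rehurek <me@radimrehurek.com> # Licensed under the GNU LGPL v2.1 - http://www.gnu.org/licenses/lgpl.html """ Deep learning via the d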
cantaloupe 2016/01/06
doc2vec
Gensim: topic modelling for humans
class gensim.models.doc2vec.Doc2Vec(documents=None, corpus_file=None, vector_size=100, dm_mean=None, dm=1, dbow_words=0, dm_concat=0, dm_tag_count=1, dv=None, dv_mapfile=None, comment=None, trim_rule=None, callbacks=(), window=5, epochs=10, shrink_windows=True, **kwargs) Bases: Word2Vec. Class for training, using and evaluating neural networks described in Distributed Representations of Sentences
cantaloupe 2016/01/06
doc2vec
Feeding (English) Wikipedia into Word2Vec - ぼろぼろ平原
2015-10-24 Feeding (English) Wikipedia into Word2Vec. The belated Word2Vec series, part 2! What you need: enwiki-*-pages-articles.xml.bz2, which can be downloaded here: Index of /enwiki/. This time I used the 2015-04-03 data. Python 2.7 + gensim + pattern # Install pattern $ pip install pattern I started out on Python 3, but pattern did not support Python 3 yet. Converting the file: first, convert the XML format to plain text, lemmatizing at the same time. Create the following script. process_wiki.py #!/usr/bin/env python # -*- codi
cantaloupe 2016/01/06
word2vec
word2vec - RupyWiki
cantaloupe 2016/01/06
word2vec
From word2vec to doc2vec: an approach driven by Chinese restaurant process | Kifi Engineering Blog
From word2vec to doc2vec: an approach driven by Chinese restaurant process Posted on March 17, 2014 by Yingjie Miao. Google’s word2vec project has created lots of interests in the text mining community. It’s a neural network language model that is “both supervised and unsupervised”. Unsupervised in the sense that you only have to provide a big corpus, say English wiki. Supervised in the sense tha
cantaloupe 2016/01/06
doc2vec, word2vec