[B! python][machinelearning] tettsyun's bookmarks

tettsyun id:tettsyun

tettsyun's bookmarks on python and machinelearning (5)

Python logistic regression (with L2 regularization) | This Number Crunching Life
Randomness in the world with a smattering of other randomness Logistic Regression Logistic regression is used for binary classification problems -- where you have some examples that are "on" and other examples that are "off." You get as input a training set which has some examples of each class along with a label saying whether each example is "on" or "off". The goal is to learn a model from the t
tettsyun 2010/01/20
An implementation of L2 regularization for logistic regression

python

machinelearning
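
For reference, the objective that "logistic regression with L2 regularization" usually refers to can be written as below. This is the standard textbook form, not necessarily the exact formulation in the bookmarked article; the symbols (labels y_i in {0, 1}, feature vectors x_i, weights w, regularization strength alpha) are assumptions introduced here.

```latex
\[
J(\mathbf{w}) = -\sum_{i=1}^{n}\Big[\, y_i \log \sigma(\mathbf{w}^\top \mathbf{x}_i)
  + (1 - y_i)\log\big(1 - \sigma(\mathbf{w}^\top \mathbf{x}_i)\big) \Big]
  + \frac{\alpha}{2}\,\lVert \mathbf{w} \rVert_2^2,
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
\[
\nabla J(\mathbf{w}) = \sum_{i=1}^{n}\big(\sigma(\mathbf{w}^\top \mathbf{x}_i) - y_i\big)\,\mathbf{x}_i
  + \alpha\,\mathbf{w}
\]
```

The penalty term only adds alpha * w to the gradient, which is why an unregularized implementation typically needs very few changes to support it.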
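Não Aqui! » Training a logistic regression model in just over 10 lines
Python code showing that training a logistic regression model can be written very simply using stochastic gradient descent (SGD). Excluding comments and blank lines, it comes to a dozen or so lines. List comprehensions, the conditional operator (the ternary operator, in C terms), and the self-initializing dictionary type (collections.defaultdict) may not be seen much outside Python. List comprehensions apparently also exist in Haskell, OCaml, and C#, so they may be fairly mainstream. Writing [W[x] for x in X] means "a list of the results of computing W[x] for every x contained in X." Because the sum function returns the sum of the values in a list, the variable a ends up holding the inner product of X and W. In Python, the ternary operator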
tettsyun 2010/01/20
A simple implementation of logistic regression

python

machinelearning
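
The Não Aqui! excerpt above describes SGD training built from a list comprehension, a conditional expression, and collections.defaultdict. The following is a minimal sketch of that idea, not the article's actual code: the names W, X, and a come from the excerpt, while sigmoid, train, predict, the learning rate eta, the epoch count, and the toy data are all assumptions made for illustration. Features are assumed to be binary (an example is just the list of feature names that are "on").

```python
import math
import random
from collections import defaultdict


def sigmoid(a):
    # Logistic function, written to avoid overflow for large |a|.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def train(data, epochs=50, eta=0.1, seed=0):
    """Train weights by SGD. data is a list of (label, X) pairs where
    label is 0 or 1 and X is a list of active (binary) feature names."""
    rng = random.Random(seed)
    W = defaultdict(float)  # unseen features start with weight 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for y, X in data:
            a = sum([W[x] for x in X])  # inner product of X and W
            g = sigmoid(a) - y          # gradient of the log loss w.r.t. a
            for x in X:
                W[x] -= eta * g
    return W


def predict(W, X):
    a = sum([W[x] for x in X])
    return 1 if a > 0 else 0  # conditional expression (Python's ternary form)


# Hypothetical toy data: label 1 for "sunny" examples, 0 for "rainy" ones.
data = [
    (1, ["sunny", "warm"]),
    (1, ["sunny", "mild"]),
    (0, ["rainy", "cold"]),
    (0, ["rainy", "mild"]),
]
W = train(data)
```

Calling predict(W, ["sunny", "warm"]) on the trained weights classifies a new example. Because defaultdict(float) returns 0.0 for features never seen in training, unknown features simply contribute nothing to the inner product.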
Latent Semantic Analysis in Python - Joseph Wilk
Latent Semantic Analysis (LSA) is a mathematical method that tries to bring out latent relationships within a collection of documents. Rather than looking at each document isolated from the others it looks at all the documents as a whole and the terms within them to identify relationships. An example of LSA: Using a search engine search for “sand”. Documents are returned which do not contain the s
tettsyun 2009/11/05
lsa

python

machinelearning
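Latent semantic analysis (LSA) with SciPy - 未来は僕以外の手の中
One technique in natural language processing is latent semantic analysis (LSA). Given a term-document matrix A, singular value decomposition (SVD) factors it as A = UΣV, and keeping only the k largest singular values reduces the rank, A_k = U_k Σ_k V_k, which gives the minimum-error rank-k approximation of A. In other words, once you can compute the SVD, LSA follows almost immediately, and with SciPy, Python's numerical analysis module, the SVD itself takes no time at all. First, up to the SVD ↓ from numpy import * from scipy import linalg A = matrix([ [5, 8, 9, -4, 2, 4], [2, -4, 9, 4, 3, 3], [-3, 4, 8, 0, 5, 6], [-2, 5, 4, 7, 0, 2] ]) u, sigma, v = linalg.sv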
tettsyun 2009/10/30
scipy, LSA(LSI)

python

machinelearning

algorithm
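
The rank reduction described in the SciPy LSA excerpt above can be sketched as follows. This is an illustration under stated assumptions rather than the article's code: it uses numpy.linalg.svd instead of the scipy.linalg call that the excerpt begins, reuses the matrix A from the excerpt, and the helper name rank_k_approximation and the choice k = 2 are hypothetical.

```python
import numpy as np

# Term-document matrix taken from the excerpt.
A = np.array([
    [5, 8, 9, -4, 2, 4],
    [2, -4, 9, 4, 3, 3],
    [-3, 4, 8, 0, 5, 6],
    [-2, 5, 4, 7, 0, 2],
], dtype=float)


def rank_k_approximation(A, k):
    """Return the rank-k approximation built from the k largest singular values."""
    # numpy returns singular values in descending order; vt is already V transposed.
    u, sigma, vt = np.linalg.svd(A, full_matrices=False)
    return u[:, :k] @ np.diag(sigma[:k]) @ vt[:k, :]


k = 2  # hypothetical choice of how many singular values to keep
A_k = rank_k_approximation(A, k)

# The Frobenius-norm error equals the root of the sum of the squared
# discarded singular values (Eckart-Young theorem).
sigma = np.linalg.svd(A, compute_uv=False)
error = np.linalg.norm(A - A_k)
```

Note that the third value returned by the SVD routine is already the transposed factor, so no extra transpose is needed when multiplying the truncated pieces back together.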
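C++: Fast hierarchical clustering through parallelization on multi-core CPUs
I needed to run hierarchical clustering on a fairly large data set and started with R, which I had on hand, but it was slower than I expected. So I decided to speed things up by parallelizing across a multi-core CPU. There are apparently programs that accelerate this in R using GPGPU, but I could not get hold of a high-performance GPU right away, and since parallelizing with the TBB library takes little effort or time, I judged it better to just write it myself. Drawing dendrograms of the cluster data this program produces and partitioning it by threshold are too much to cover in a single article, so I will leave them to a separate article, "Python: Drawing dendrograms for hierarchical clustering and partitioning by threshold." To begin with, "Programming Collective Intelligence" included Python source code for hierarchical clustering, so I rewrote it in C++. The algorithm itself is not all that complicated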
tettsyun 2009/10/10
clustering tbb

c++

python

machinelearning
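
As a rough illustration of the kind of hierarchical (agglomerative) clustering the excerpt above refers to, here is a naive single-threaded sketch. It is not the code from the book or from the C++ article: the Euclidean distance, the centroid-based merge criterion, the stopping threshold, and the names euclidean, centroid, and hcluster are all assumptions made for this example.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def hcluster(points, threshold):
    """Repeatedly merge the two clusters with the closest centroids,
    stopping once the closest pair is farther apart than threshold."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > 1:
        best = None
        # Exhaustive pairwise search: quadratic in the number of clusters per merge.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
result = hcluster(points, threshold=3.0)
```

The nested pairwise distance search dominates the running time of this naive version, which is the kind of cost that makes larger data sets slow to cluster.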