[B! statistics] [Page 5] ichan's bookmarks

ichan id:ichan

ichan's bookmarks on statistics (369)

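Perplexity as a measure of language model quality - yasuhisa's blog
Today's lecture in the Advanced Natural Language Processing course was very interesting, so here are some notes. The lecture began by introducing n-gram models as language models, followed by their maximum likelihood estimation. Next, perplexity, which is based on entropy*1, was introduced as a measure of how good a language model is. The derivation of perplexity went roughly like this. First came the derivation of information content and entropy; then, assuming (in the frequentist sense) that a true model exists, one tries to measure the notional distance between that model and one's own model with relative entropy (= KL divergence). The trouble is that the distribution of the true model is unknown, but apparently there is a wonderful theorem called the Shannon-McMillan-Breiman theorem (said to be explained in 言語と計算 (4) 確率的言語モデル), and with it the cross entropy against the true model can be computed!! This is amazing. Well, with this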
ichan 2009/09/29
Shannon-McMillan-Breimanの定理

Statistics

Machine Learning
Link
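As a rough illustration of the perplexity entry above, the sketch below computes perplexity as 2 raised to the cross entropy (average negative log2 probability per token) of a unigram model on held-out text. The toy corpus, the function names, and the add-alpha smoothing are assumptions made for this example; the lecture notes only mention n-gram models with maximum likelihood estimation.

```python
import math
from collections import Counter

def train_unigram(tokens, vocab, alpha=1.0):
    """Unigram probabilities with add-alpha smoothing.

    Smoothing is an assumption of this sketch: it keeps unseen words
    from getting probability zero, which would make perplexity infinite.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def perplexity(model, tokens):
    """Return 2 ** H, where H is the per-token cross entropy in bits."""
    cross_entropy = -sum(math.log2(model[w]) for w in tokens) / len(tokens)
    return 2 ** cross_entropy

# Hypothetical toy data, not taken from the bookmarked post.
train = "a b a c a b".split()
test = "a b c".split()
vocab = sorted(set(train) | set(test))

model = train_unigram(train, vocab)
ppl = perplexity(model, test)
print(f"perplexity = {ppl:.3f}")
```

A uniform model over V words has perplexity exactly V, which is why perplexity is often read as an effective branching factor.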
The Bayesian Songbook
The Bayesian Singalong Book Abstract Recently updated for Valencia 9 (June 2010), this is a collection of some of the most well-known Bayesian Songbook classics, dating back to the very earliest days of the Valencia meeting cabarets. Like many "greatest hits" collections, it of course omits a lot of great material in order to make room for better-known "hits," some of which seem to smack of commer
ichan 2009/09/09
A collection of Bayesian parody songs. I'd like them to add my "Bayes! Bayes! Bayes!" too

Bayes

Statistics
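Link
"本の現場" is an amazing book
Required reading for anyone in publishing, and for book lovers too. It collects and supplements a series of articles on the themes "How are books produced?" and "How are books read?" An amazing book that gave me many insights, leads, and new hints. I plan to try out the hints on this blog as I go. ■ Have people really stopped reading books? I have long wondered about that. True, more people fiddle with mobile phones (handsets, games) on commuter trains, but some people still open a paperback, so being told on that basis alone that mobile phones are driving out books is unconvincing. Libraries are thriving, and the school my child attends seems to be putting a lot of effort into its "morning reading" program. The claim that "books don't sell" does not ring true either. My regular bookstore is always packed, and because I dislike queuing at the register I end up using Amazon. A banner reading "Limit 2 copies per customer" was hanging over the stack of a certain Murakami's new release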
ichan 2009/09/02
Statistics
Link
R Graph Gallery :: Home
Welcome! The R Graph Gallery aims to present several different graphics fully created with the programming environment R [http://www.r-project.org]. Graphs are gathered in a MySQL database and browsable thanks to PHP. We hope that this gallery will provide many benefits, including: Discover new graphics that are suited to specific situations Highlight the poweRful graphical abilities of R Sh
ichan 2009/08/31
Visualization

R

Statistics
Link
Darren Wilkinson - SMfSB
ichan 2009/08/31
SMfSB

Systems Biology

Statistics

SBML
Link
MCMC revisited (1): Taglibro de H
To prepare the text for a certain statistics course this autumn, I am going through MCMC again from the beginning. Example 1: Given the sample X = (5.89, 5.69, 4.71, 6.73, 4.90, 3.34, 4.60, 4.00) drawn from a population known to be normally distributed, estimate the population mean and standard deviation by Bayesian inference. Code ## ## Sample 1 ## library(R2WinBUGS) ## data X <- c(5.89, 5.69, 4.71, 6.73, 4.90, 3.34, 4.60, 4.00) ## model model <- function() { for (i in 1:N) { X[i] ~ dnorm(mu, tau); } ## priors mu ~ dnorm(0.0, 1.0E-6); tau ~ dgamma(1.0E-3, 1.0E-3); si
ichan 2009/08/27
R

Statistics

Bayes
Link
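The MCMC entry above fits a normal model with WinBUGS via R2WinBUGS, and its excerpt is cut off before the model is complete. As a stand-in, here is a minimal Gibbs sampler in plain Python for the same data and the same priors (mu ~ Normal(0, precision 1e-6), tau ~ Gamma(1e-3, 1e-3)). This is a swapped-in technique, not the author's code: the function name, iteration counts, and seed are assumptions for illustration.

```python
import math
import random

# Sample from the bookmarked example.
X = [5.89, 5.69, 4.71, 6.73, 4.90, 3.34, 4.60, 4.00]

def gibbs_normal(x, n_iter=6000, burn_in=1000, seed=1):
    """Gibbs sampling for a normal model with unknown mean and precision.

    Uses the conjugate full conditionals:
      mu  | tau, x ~ Normal(mean, precision = tau0 + n * tau)
      tau | mu,  x ~ Gamma(shape = a + n/2, rate = b + SS/2)
    """
    rng = random.Random(seed)
    n, sum_x = len(x), sum(x)
    mu0, tau0 = 0.0, 1.0e-6   # prior mean and precision for mu
    a, b = 1.0e-3, 1.0e-3     # prior shape and rate for tau
    mu, tau = sum_x / n, 1.0  # starting values (an arbitrary choice)
    draws = []
    for it in range(n_iter):
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * sum_x) / prec
        mu = rng.gauss(mean, 1.0 / math.sqrt(prec))
        ss = sum((xi - mu) ** 2 for xi in x)
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        tau = rng.gammavariate(a + n / 2.0, 1.0 / (b + ss / 2.0))
        if it >= burn_in:
            draws.append((mu, 1.0 / math.sqrt(tau)))  # store (mu, sigma)
    return draws

draws = gibbs_normal(X)
mu_hat = sum(m for m, _ in draws) / len(draws)
sigma_hat = sum(s for _, s in draws) / len(draws)
print(f"posterior mean of mu    = {mu_hat:.3f}")
print(f"posterior mean of sigma = {sigma_hat:.3f}")
```

With priors this vague, the posterior mean of mu should land close to the sample mean of X.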
R You Ready for R? | wrong, rogue and log
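A few days ago the NY Times covered the statistical environment/language R [http://www.r-project.org/]. It even got a follow-up blog post, which is exceptional treatment. Data Analysts Captivated by R’s Power http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=1 R You Ready for R? http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/ The tone of the article is very favorable to R, presenting it as one of the great success stories of Open Source projects, and noting that precisely because it is an Open Source project, individual libraries can respond well to users' diverse and complex needs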
ichan 2009/08/11
R

Statistics

Program
Link
Stephen Marsland
This webpage contains the code and other supporting material for the textbook "Machine Learning: An Algorithmic Perspective" by Stephen Marsland, published by CRC Press, part of the Taylor and Francis group. The first edition was published in 2009, and a revised and updated second edition is due out towards the end of 2014. The book is aimed at computer science and engineering undergraduates studi
ichan 2009/08/10
Statistics

Python

Machine Learning

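Try later
Link
2009 UCSD/FICO Data Mining Contest and so on - Standard ML of Yukkuri
http://mill.ucsd.edu/index.php?page=Results I took part in a data mining contest hosted by UCSD, open to everyone from undergraduates to postdocs. My team name was smly, a one-person team. From NAIST, team west, made up of senior members of the 論理生命学 (theoretical life science) lab, also took part, and it seems they won one of the tasks. Congratulations. The contest runs for about two months, but for various reasons I effectively participated for only about two weeks and did not place, which was a so-so result. I think I was busy around then with material for a workshop and a YANS presentation. Next year, though, I will place and go for the prize money. I want the prize money. The task was a simple binary classification: given E-commerce transaction data, classify the anomalous transactions (positive examples) contained in it, with easy and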
ichan 2009/08/08
Machine Learning

Statistics

Meeting
Link
Toy box full of toys
ichan 2009/08/06
I compiled the people who reacted. Once again, my condolences. It might also be useful for finding statistics people.

Statistics

Machine Learning
Link
Toy box full of toys
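Roundup of memorial posts for Professor Akaike. Background: I learned of Professor Akaike's passing from a post by @bonohu. dritoshi: What, Professor Akaike!!! [http://twitter.com/dritoshi/status/3142569480] dritoshi: Launching the society for making AIC and Professor Akaike buzz [http://twitter.com/dritoshi/status/3142586000] dritoshi: I believe this is Professor Akaike's first paper on AIC. AIC used to stand for An information criterion http://www.garfield.library... [http://twitter.com/dritoshi/status/3142653202] Here I go explaining AIC, as both a tribute and a review. Not that anyone asked, lol dritoshi: A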
ichan 2009/08/06
Self-bookmark. I collected only my own AIC-related posts

Statistics

Machine Learning

itoshi
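Link
UCA WORKS
This is a track composed in 2019 for use as background music. Production took about a month, from early to late September. I composed it as background music for video meant to evoke a bright future and sincerity. The reference track is on Youtube…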
ichan 2009/07/22
Programming

Statistics
Link
Comments on "X-means: a method for estimating an appropriate number of clusters - kaiseh's blog"
ichan 2009/07/10
Try later

Statistics
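Link
bayon, a lightweight data clustering tool - mixi engineer blog
This is fujisawa, who recently finished 逆転検事 and is wondering whether to replay 逆転裁判 1〜3 after a long while. I built a simple data clustering tool, so let me introduce it. What is clustering? Clustering means grouping together similar items in a target data set and splitting the data set into several groups. It is widely used in data mining and statistical analysis, and is useful when you want to examine trends in a data set. In the example in the figure below, the data were at first jumbled together and hard to make sense of, but clustering shows that they actually consist of just three groups. Many clustering methods have been proposed; a well-known one is K-means. I won't go into details here, but those who want to know more about clustering should see the following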
ichan 2009/06/11
Try later

Statistics

Program
Link
Bayesian linear regression – Mailund on the Internet
Mailund on the Internet Computer science, bioinformatics, genetics, and everything in between I’m currently reading Bayesian Statistics: An introduction by Peter M. Lee, and today I’m reading chapter 6 on linear regression. I decided to play around with one of the examples in R. The model is just linear regression as we know and love it, that is we have explanatory variables $$x_i$$ and dependent
ichan 2009/05/25
R

Try later

Bayes

Statistics
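Link
Histograms and density estimation - RjpWiki
Density f(x) = 0.6φ(x)+0.4ψ(x) Let φ be the density of a normal distribution with mean -1 and variance 1, and ψ the density of a normal distribution with mean 2 and variance 1, and define the density function truedensity() with f(x) = 0.6φ(x)+0.4ψ(x). truedensity <- function (x) { 0.6/sqrt(2*pi)*exp(-(x+1)^2/2) + 0.4/sqrt(2*pi)*exp(-(x-2)^2/2) } > curve(truedensity, xlim=c(-6,6), ylim=c(0,0.3), col=2) A function that generates random numbers following f(x) Next, define a function that generates random numbers following f(x). generator_tmp <- function(n) { data1 <- rnorm(n)-1 # random numbers following φ(x) data2 <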
ichan 2009/05/20
Statistics

Histogram
Link
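The RjpWiki entry above defines the mixture density f(x) = 0.6φ(x)+0.4ψ(x) in R, and its excerpt breaks off inside the random number generator. The Python sketch below mirrors the density, draws samples by picking a component with probability 0.6 or 0.4, and builds a plain equal-width histogram density estimate. The sampling scheme, the bin count, the seed, and the helper names other than truedensity are assumptions; the wiki's own generator may work differently.

```python
import math
import random

def truedensity(x):
    """Mixture density 0.6 * N(-1, 1) + 0.4 * N(2, 1)."""
    phi = math.exp(-(x + 1) ** 2 / 2)
    psi = math.exp(-(x - 2) ** 2 / 2)
    return (0.6 * phi + 0.4 * psi) / math.sqrt(2 * math.pi)

def generator(n, rng):
    """Draw n samples: component N(-1, 1) w.p. 0.6, else N(2, 1)."""
    return [
        rng.gauss(-1.0, 1.0) if rng.random() < 0.6 else rng.gauss(2.0, 1.0)
        for _ in range(n)
    ]

def histogram_density(data, n_bins):
    """Equal-width histogram scaled so that the bar areas sum to 1."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in data:
        # Clamp so the maximum value falls in the last bin.
        k = min(int((v - lo) / width), n_bins - 1)
        counts[k] += 1
    return lo, width, [c / (len(data) * width) for c in counts]

rng = random.Random(0)
data = generator(2000, rng)
lo, width, dens = histogram_density(data, 30)

# Compare the estimate with the true density at each bin centre.
for k, d in enumerate(dens):
    centre = lo + (k + 0.5) * width
    print(f"{centre:6.2f}  hist={d:.3f}  true={truedensity(centre):.3f}")
```

The bin count of 30 is arbitrary here; choosing it from the data is the subject of the bin-width entries that follow.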
Data-Based Choice of Histogram Bin Width on JSTOR
ichan 2009/05/20
The original paper for Wand's plug-in method

Statistics

Histogram
Link
Histogram - Wikipedia
For the histogram used in digital image processing, see Image histogram and Color histogram. A histogram is a visual representation of the distribution of quantitative data. The term was first introduced by Karl Pearson.[1] To construct a histogram, the first step is to "bin" (or "bucket") the range of values— divide the entire range of values into a series of intervals—and then count how many val
ichan 2009/05/20
Statistics

Histogram
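Link
R -- Searching for the optimal class division of a histogram (frequency table) using AIC
Searching for the optimal class division of a histogram (frequency table) using AIC Last modified: Jun 13, 2007 Purpose: Use AIC to search for the class division that gives the optimal frequency table. Usage: AIC.Histogram(x, d = 0, c = floor(2*sqrt(length(x))-1)) Arguments: x data vector; d measurement precision (0 for infinite precision); c initial number of classes. Source: To install, copy the following line and paste it into the R console: source("http://aoki2.si.gunma-u.ac.jp/R/src/AIC-Histogram.R", encoding="euc-jp") # Searching for the optimal class division of a histogram (frequency table) using AIC # AIC.Histogram <- function( x, # data vector d = 0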
ichan 2009/05/20
Statistics

Histogram
Link
PLOS Biology: A Peer-Reviewed Open-Access Journal
Kenneth Dyar, Henriette Uhlenhaut and colleagues compare in vivo genomic binding patterns of transcription factors BMAL1 and REV-ERB with 24-hour transcriptional and metabolic changes upon their loss to reveal how the circadian clock directs the diurnal rhythms of muscle metabolism.
ichan 2009/04/23
Bio

Mathematics

Physics

Statistics

Theoretical biology

qbio
Link