stats.stackexchange.com
I'm currently trying to wrap my head around the t-SNE math. Unfortunately, there is still one question I can't answer satisfactorily: What is the actual meaning of the axes in a t-SNE graph? If I were to give a presentation on this topic or include it in any publication: How would I label the axes appropriately? P.S: I read this Reddit question but the answers given there (such as "it depends on i
What is the practical difference between Wasserstein metric and Kullback-Leibler divergence? Wasserstein metric is also referred to as Earth mover's distance. From Wikipedia: Wasserstein (or Vaserstein) metric is a distance function defined between probability distributions on a given metric space M. and Kullback–Leibler divergence is a measure of how one probability distribution diverges from a s
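As a rough illustration of the practical difference, here is a minimal sketch (my own, assuming SciPy is available; the distributions are made up) comparing the two quantities on two discrete distributions: the Wasserstein distance depends on how far mass must move along the support, while KL only compares the probabilities assigned to the same points.

```python
# Minimal sketch (not from the question): contrast the two quantities
# on two simple discrete distributions. Assumes scipy is installed.
import numpy as np
from scipy.stats import wasserstein_distance, entropy

support = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.4, 0.4, 0.1, 0.1])
q = np.array([0.1, 0.1, 0.4, 0.4])

# KL divergence ignores the geometry of the support: it only compares
# probabilities assigned to the same points.
kl_pq = entropy(p, q)

# The Wasserstein (earth mover's) distance depends on how far mass must be
# moved along the support, i.e. on the underlying metric.
w_pq = wasserstein_distance(support, support, u_weights=p, v_weights=q)

print(f"KL(p || q)        = {kl_pq:.4f}")
print(f"Wasserstein(p, q) = {w_pq:.4f}")
```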
Ok, this is a quite basic question, but I am a little bit confused. In my thesis I write: The standard errors can be found by calculating the inverse of the square root of the diagonal elements of the (observed) Fisher Information matrix: \begin{align*} s_{\hat{\mu},\hat{\sigma}^2}=\frac{1}{\sqrt{\mathbf{I}(\hat{\mu},\hat{\sigma}^2)}} \end{align*} Since the optimization command in R minimizes $-\l
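A minimal numerical sketch of the usual recipe (my own illustration, not the thesis code): invert the observed Fisher information (the Hessian of the negative log-likelihood at the optimum) and take the square roots of its diagonal.

```python
# Sketch (assumed example, not from the question): standard errors from the
# observed Fisher information for a normal sample, using the Hessian of the
# negative log-likelihood at the MLE.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=500)
n = x.size

# MLEs for (mu, sigma^2)
mu_hat = x.mean()
s2_hat = x.var()  # ML estimate (divides by n)

# Observed Fisher information = Hessian of -log L evaluated at the MLE.
# For the normal model this has a known closed form.
I_obs = np.array([
    [n / s2_hat,            0.0],
    [0.0,        n / (2 * s2_hat**2)],
])

# Standard errors: sqrt of the diagonal of the *inverse* information matrix
# (not the elementwise reciprocal square root of the diagonal).
se = np.sqrt(np.diag(np.linalg.inv(I_obs)))
print("SE(mu_hat), SE(s2_hat):", se)
```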
Kernel is a way of computing the dot product of two vectors $\mathbf x$ and $\mathbf y$ in some (possibly very high dimensional) feature space, which is why kernel functions are sometimes called "generalized dot product". Suppose we have a mapping $\varphi \, : \, \mathbb R^n \to \mathbb R^m$ that brings our vectors in $\mathbb R^n$ to some feature space $\mathbb R^m$. Then the dot product of $\ma
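A small numerical check (my own sketch, not part of the answer) of this idea for the degree-2 polynomial kernel on $\mathbb R^2$: the kernel value equals an ordinary dot product under an explicitly constructed feature map.

```python
# Sketch (illustrative): verify that the degree-2 polynomial kernel
# k(x, y) = (x . y)^2 on R^2 equals the dot product of an explicit
# feature map phi: R^2 -> R^3.
import numpy as np

def phi(v):
    # Explicit feature map for k(x, y) = (x . y)^2 on R^2.
    return np.array([v[0]**2, v[1]**2, np.sqrt(2) * v[0] * v[1]])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

kernel_value = np.dot(x, y) ** 2            # computed without leaving R^2
feature_dot  = np.dot(phi(x), phi(y))       # dot product in feature space

print(kernel_value, feature_dot)  # both equal 1.0 here: (1*3 + 2*(-1))^2 = 1
```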
I've found that ImageNet and other large CNNs make use of local response normalization layers. However, I cannot find much information about them. How important are they, and when should they be used? From http://caffe.berkeleyvision.org/tutorial/layers.html#data-layers: "The local response normalization layer performs a kind of “lateral inhibition” by normalizing over local input regions. In
Thank you for the interesting question! Difference: One limitation of standard count models is that the zeros and the nonzeros (positives) are assumed to come from the same data-generating process. With hurdle models, these two processes are not constrained to be the same. The basic idea is that a Bernoulli probability governs the binary outcome of whether a count variate has a zero or positive re
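A small simulation sketch (my own, with made-up parameters) of that two-part data-generating process: a Bernoulli draw decides zero versus positive, and positives come from a zero-truncated count distribution.

```python
# Sketch (assumed parameters, not from the answer): simulate a hurdle model.
# A Bernoulli process decides whether the count is zero or positive; positive
# counts are drawn from a zero-truncated Poisson.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
p_positive = 0.3   # Bernoulli probability of clearing the "hurdle"
lam = 2.5          # rate of the count process for positives

def zero_truncated_poisson(lam, size, rng):
    # Simple rejection sampler: redraw until the value is positive.
    out = rng.poisson(lam, size)
    while np.any(out == 0):
        zeros = out == 0
        out[zeros] = rng.poisson(lam, zeros.sum())
    return out

is_positive = rng.random(n) < p_positive
y = np.zeros(n, dtype=int)
y[is_positive] = zero_truncated_poisson(lam, is_positive.sum(), rng)

print("share of zeros:", np.mean(y == 0))          # ~ 1 - p_positive
print("mean of positives:", y[is_positive].mean()) # ~ lam / (1 - exp(-lam))
```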
In today's pattern recognition class my professor talked about PCA, eigenvectors and eigenvalues. I understood the mathematics of it. If I'm asked to find eigenvalues etc. I'll do it correctly like a machine. But I didn't understand it. I didn't get the purpose of it. I didn't get the feel of it. I strongly believe in the following quote: You do not really understand something unless you can expla
Principal component analysis (PCA) is usually explained via an eigen-decomposition of the covariance matrix. However, it can also be performed via singular value decomposition (SVD) of the data matrix $\mathbf X$. How does it work? What is the connection between these two approaches? What is the relationship between SVD and PCA? Or in other words, how to use SVD of the data matrix to perform dimen
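A short numerical sketch (my own, on synthetic data) of the connection the question asks about: the right singular vectors of the centered data matrix are the principal axes, and the squared singular values divided by $n$ are the eigenvalues of the covariance matrix.

```python
# Sketch (illustrative): check that PCA via eigendecomposition of the
# covariance matrix and PCA via SVD of the centered data matrix agree.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)               # center the data
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix C = X^T X / n
C = Xc.T @ Xc / n
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]     # sort eigenvalues in decreasing order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data, X = U S V^T
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of C are the squared singular values divided by n, and the
# right singular vectors V are the principal axes (up to sign).
print(np.allclose(eigvals, S**2 / n))
print(np.allclose(np.abs(eigvecs), np.abs(Vt.T)))
```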
I have just heard that it's a good idea to choose the initial weights of a neural network from the range $(\frac{-1}{\sqrt d} , \frac{1}{\sqrt d})$, where $d$ is the number of inputs to a given neuron. It is assumed that the inputs are normalized - mean 0, variance 1 (I don't know if this matters). Why is this a good idea?
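A quick numerical sketch (my own) of why that range behaves well: with unit-variance inputs, weights drawn uniformly from $(-1/\sqrt d, 1/\sqrt d)$ have variance $1/(3d)$, so the variance of a neuron's pre-activation stays near $1/3$ regardless of $d$.

```python
# Sketch (illustrative): with inputs of mean 0 and variance 1, weights drawn
# uniformly from (-1/sqrt(d), 1/sqrt(d)) give a pre-activation whose variance
# stays around 1/3 no matter how many inputs d the neuron has.
import numpy as np

rng = np.random.default_rng(0)

for d in (10, 100, 1000):
    n_trials = 20_000
    x = rng.normal(size=(n_trials, d))                    # normalized inputs
    w = rng.uniform(-1 / np.sqrt(d), 1 / np.sqrt(d), size=(n_trials, d))
    pre_activation = np.sum(w * x, axis=1)
    print(d, pre_activation.var())  # ~ d * (1/(3d)) * 1 = 1/3 for every d
```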
What are the similarities and differences between these three methods: Bagging, Boosting, and Stacking? Which is the best one, and why? Can you give me an example of each?
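For one concrete, hedged example of each (assuming scikit-learn's ensemble module; the dataset and settings are arbitrary), the sketch below builds a bagging, a boosting, and a stacking classifier on a toy problem.

```python
# Sketch (scikit-learn assumed): one bagging, one boosting and one stacking
# ensemble on a synthetic classification problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              StackingClassifier, RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    # Bagging: many models trained on bootstrap resamples, predictions averaged.
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    # Boosting: models trained sequentially, each focusing on previous errors.
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
    # Stacking: a meta-learner combines the predictions of base models.
    "stacking": StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000),
    ),
}

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:9s} accuracy ~ {score:.3f}")
```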
I would like to implement an algorithm for automatic model selection. I am thinking of doing stepwise regression, but anything will do (it has to be based on linear regressions, though). My problem is that I am unable to find a methodology, or an open source implementation (I am working in Java). The methodology I have in mind would be something like: calculate the correlation matrix of all the facto
Many authors of papers I read affirm that SVMs are a superior technique for their regression/classification problem, aware that they couldn't get similar results through NNs. Often the comparison states that SVMs, unlike NNs: have a strong founding theory; reach the global optimum thanks to quadratic programming; have no issue choosing a proper number of parameters; are less prone to overfitting; nee
The wikipedia page claims that likelihood and probability are distinct concepts. In non-technical parlance, "likelihood" is usually a synonym for "probability," but in statistical usage there is a clear distinction in perspective: the number that is the probability of some observed outcomes given a set of parameter values is regarded as the likelihood of the set of parameter values given the obser
I'm confused about how to calculate the perplexity of a holdout sample when doing Latent Dirichlet Allocation (LDA). The papers on the topic breeze over it, making me think I'm missing something obvious... Perplexity is seen as a good measure of performance for LDA. The idea is that you keep a holdout sample, train your LDA on the rest of the data, then calculate the perplexity of the holdout. The
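As a hedged sketch of the computation itself (not tied to any particular LDA implementation; the numbers below are invented): perplexity is the exponent of the negative average per-token log-likelihood of the held-out documents.

```python
# Sketch (illustrative, made-up numbers): perplexity from held-out
# log-likelihoods. Given the total log-likelihood the trained LDA model
# assigns to the holdout and the number of tokens in it, perplexity is
# exp(-log-likelihood / token count).
import numpy as np

# Assume these came from evaluating the trained model on the holdout set.
holdout_log_likelihood = -152_340.7   # sum of log p(w | trained model)
holdout_token_count = 25_000          # total number of word tokens held out

perplexity = np.exp(-holdout_log_likelihood / holdout_token_count)
print(f"holdout perplexity: {perplexity:.1f}")
```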
An autoencoder is a simple 3-layer neural network where the output units are directly connected back to the input units. E.g. in a network like this: output[i] has an edge back to input[i] for every i. Typically, the number of hidden units is much smaller than the number of visible (input/output) ones. As a result, when you pass data through such a network, it first compresses (encodes) the input vector to "fit" in a smaller
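A minimal sketch (my own, with arbitrary layer sizes) of that encode/decode shape: a tiny autoencoder forward pass in NumPy compressing 20-dimensional inputs through a 5-unit hidden layer and reconstructing them.

```python
# Sketch (illustrative): the forward pass of a tiny autoencoder. 20 input
# units are compressed (encoded) into 5 hidden units and then reconstructed
# (decoded) back to 20 outputs; training would minimize the reconstruction error.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 20, 5

W_enc = rng.normal(scale=0.1, size=(n_visible, n_hidden))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_visible))   # decoder weights

def forward(x):
    hidden = np.tanh(x @ W_enc)          # encode: compress to 5 numbers
    reconstruction = hidden @ W_dec      # decode: expand back to 20 numbers
    return hidden, reconstruction

x = rng.normal(size=(1, n_visible))
code, x_hat = forward(x)
print("code shape:", code.shape)                       # (1, 5)
print("reconstruction error:", np.mean((x - x_hat)**2))
```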
Let your (centered) data be stored in a $n\times d$ matrix $\mathbf X$ with $d$ features (variables) in columns and $n$ data points in rows. Let the covariance matrix $\mathbf C=\mathbf X^\top \mathbf X/n$ have eigenvectors in columns of $\mathbf E$ and eigenvalues on the diagonal of $\mathbf D$, so that $\mathbf C = \mathbf E \mathbf D \mathbf E^\top$. Then what you call "normal" PCA whitening tr
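A short numerical sketch (my own, synthetic data) of the whitening transform in the answer's notation: with $\mathbf C = \mathbf E \mathbf D \mathbf E^\top$, multiplying the centered data by $\mathbf E \mathbf D^{-1/2}$ yields components with identity covariance.

```python
# Sketch (illustrative): PCA whitening following the answer's notation,
# with centered data X (n x d) and covariance C = X^T X / n = E D E^T.
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 4
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated data
X = X - X.mean(axis=0)                                   # center

C = X.T @ X / n
eigvals, E = np.linalg.eigh(C)          # D holds eigvals, E the eigenvectors
D_inv_sqrt = np.diag(1.0 / np.sqrt(eigvals))

# "Normal" PCA whitening: rotate onto the eigenbasis, then rescale each
# component by the inverse square root of its eigenvalue.
X_white = X @ E @ D_inv_sqrt

# The whitened data now has (approximately) identity covariance.
print(np.round(X_white.T @ X_white / n, 3))
```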
EDIT: The Web Technologies and Services CRAN task view contains a much more comprehensive list of data sources and APIs available in R. You can submit a pull request on github if you wish to add a package to the task view. I'm making a list of the various data feeds that are already hooked into R or that are easy to setup. Here's my initial list of packages, and I was wondering what else I'm missi
I found this confusing when using the neural network toolbox in Matlab. It divided the raw data set into three parts: training set, validation set, and test set. I notice that in many training or learning algorithms, the data is often divided into two parts, the training set and the test set. My questions are: what is the difference between the validation set and the test set? Is the validation set really specific to ne
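A small sketch (my own, assuming scikit-learn; the Matlab toolbox performs an analogous split internally) of carving one dataset into the three parts: fit on the training set, tune on the validation set, and report once on the test set.

```python
# Sketch (illustrative, scikit-learn assumed): split data into training,
# validation and test sets. The validation set is used to choose settings
# (e.g. when to stop, which hyperparameters); the test set is touched only
# once, for the final performance estimate.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(1000).reshape(-1, 1)
y = (X.ravel() % 2)

# First split off the test set, then split the remainder into train/validation.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 600 / 200 / 200
```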
I'm using libsvm in C-SVC mode with a polynomial kernel of degree 2 and I'm required to train multiple SVMs. Each training set has 10 features and 5000 vectors. During training, I am getting this warning for most of the SVMs that I train: WARNING: reaching max number of iterations optimization finished, #iter = 10000000 Could someone please explain what this warning implies and, perhaps, how
I am searching for [free] software that can produce nice looking graphical models, e.g. Any suggestions would be appreciated.
I need to determine the KL-divergence between two Gaussians. I am comparing my results to these, but I can't reproduce their result. My result is obviously wrong, because the KL is not 0 for KL(p, p). I wonder where I am making a mistake and ask if anyone can spot it. Let $p(x) = N(\mu_1, \sigma_1)$ and $q(x) = N(\mu_2, \sigma_2)$. From Bishop's PRML I know that $$KL(p, q) = - \int p(x) \log q(x) d
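For reference, the closed-form divergence between univariate Gaussians is $KL(p, q) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$. The following sketch (my own) evaluates it and checks the sanity condition that fails in the question, namely KL(p, p) = 0.

```python
# Sketch (illustrative): closed-form KL divergence between two univariate
# Gaussians p = N(mu1, sigma1^2) and q = N(mu2, sigma2^2), with a sanity
# check that KL(p, p) = 0.
import numpy as np

def kl_gauss(mu1, sigma1, mu2, sigma2):
    return (np.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
            - 0.5)

print(kl_gauss(0.0, 1.0, 0.0, 1.0))   # 0.0  (KL(p, p) must vanish)
print(kl_gauss(0.0, 1.0, 1.0, 2.0))   # > 0
```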
I understand the formal differences between them; what I want to know is when it is more relevant to use one vs. the other. Do they always provide complementary insight about the performance of a given classification/detection system? When is it reasonable to provide them both in a paper, say, instead of just one? Are there any alternative (maybe more modern) descriptors that capture the relevant
Say I want to estimate a large number of parameters, and I want to penalize some of them because I believe they should have little effect compared to the others. How do I decide what penalization scheme to use? When is ridge regression more appropriate? When should I use lasso?
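A quick hedged illustration (assuming scikit-learn, on synthetic data with arbitrary penalty strength) of the practical difference: the lasso's L1 penalty drives many coefficients exactly to zero, while ridge's L2 penalty only shrinks them.

```python
# Sketch (illustrative): ridge shrinks coefficients, lasso sets many of them
# exactly to zero - convenient when you believe some effects are negligible.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                       noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("ridge: coefficients exactly zero:", np.sum(ridge.coef_ == 0))
print("lasso: coefficients exactly zero:", np.sum(lasso.coef_ == 0))
```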
Data analysis cartoons can be useful for many reasons: they help communicate; they show that quantitative people have a sense of humor too; they can instigate good teaching moments; and they can help us remember important principles and lessons. This is one of my favorites: As a service to those who value this kind of resource, please share your favorite data analysis cartoon. They probably don't
Lots of people use a main tool like Excel or another spreadsheet, SPSS, Stata, or R for their statistics needs. They might turn to some specific package for very special needs, but a lot of things can be done with a simple spreadsheet or a general stats package or stats programming environment. I've always liked Python as a programming language, and for simple needs, it's easy to write a short pro