3.4. Metrics and scoring: quantifying the quality of predictions
3.4.1. Which scoring function should I use?
Before we take a closer look into the details of the many scores and evaluation metrics, we want to give some guidance, inspired by statistical decision theory, on the choice of scoring functions for supervised learning; see [Gneiting2009].
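As an illustration of how the choice of scoring function changes what is being estimated, here is a minimal sketch (not part of the guide itself; data and model are arbitrary) that cross-validates the same regressor under squared error, which is consistent for the conditional mean, and absolute error, which is consistent for the conditional median:

    # Minimal sketch: the scoring function determines which functional of the
    # target distribution a model is rewarded for predicting.
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
    model = Ridge()

    for scoring in ("neg_mean_squared_error", "neg_mean_absolute_error"):
        scores = cross_val_score(model, X, y, cv=5, scoring=scoring)
        print(scoring, scores.mean())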
LinearDiscriminantAnalysis
class sklearn.discriminant_analysis.LinearDiscriminantAnalysis(solver='svd', shrinkage=None, priors=None, n_components=None, store_covariance=False, tol=0.0001, covariance_estimator=None)
Linear Discriminant Analysis. A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes' rule. The model fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.
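A minimal usage sketch (the tiny dataset is illustrative, in the spirit of the class docstring example):

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
    y = np.array([1, 1, 1, 2, 2, 2])

    clf = LinearDiscriminantAnalysis()
    clf.fit(X, y)
    print(clf.predict([[-0.8, -1]]))  # predicts class 1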
1.16. Probability calibration
When performing classification you often want not only to predict the class label, but also to obtain a probability for that label. This probability gives you some kind of confidence in the prediction. Some models give poor estimates of the class probabilities, and some do not support probability prediction at all (e.g., some instances of SGDClassifier). The calibration module allows you to better calibrate the probabilities of a given model, or to add support for probability prediction.
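A hedged sketch of adding probability prediction to such a model with CalibratedClassifierCV (the dataset, the hinge-loss SGDClassifier, and the sigmoid method are illustrative choices, not a prescription):

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # SGDClassifier with hinge loss exposes no predict_proba on its own;
    # wrapping it in CalibratedClassifierCV adds calibrated probabilities.
    base = SGDClassifier(loss="hinge", random_state=0)
    calibrated = CalibratedClassifierCV(base, method="sigmoid", cv=3)
    calibrated.fit(X_train, y_train)
    print(calibrated.predict_proba(X_test[:3]))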
6. Strategies to scale computationally: bigger data
For some applications the number of examples, the number of features (or both), and/or the speed at which they need to be processed are challenging for traditional approaches. In these cases scikit-learn has a number of options you can consider to make your system scale.
6.1. Scaling with instances using out-of-core learning
Out-of-core (or "external memory") learning is a technique for learning from data that cannot fit in a computer's main memory (RAM).
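A minimal out-of-core sketch, assuming the data arrives in mini-batches (the batch generator below is a stand-in for reading from disk or a stream):

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.RandomState(0)
    classes = np.array([0, 1])
    clf = SGDClassifier(random_state=0)

    def stream_batches(n_batches=10, batch_size=100):
        # Stand-in for an out-of-core data source.
        for _ in range(n_batches):
            X = rng.randn(batch_size, 20)
            y = (X[:, 0] > 0).astype(int)
            yield X, y

    for X_batch, y_batch in stream_batches():
        # all possible classes must be declared on the first call
        clf.partial_fit(X_batch, y_batch, classes=classes)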
1.15. Isotonic regression
The class IsotonicRegression fits a non-decreasing real function to 1-dimensional data. It solves the following problem:

minimize \(\sum_i w_i (\hat{y}_i - y_i)^2\)

subject to \(\hat{y}_i \le \hat{y}_j\) whenever \(X_i \le X_j\),

where the weights \(w_i\) are strictly positive, and both X and y are arbitrary real quantities. The increasing parameter changes the constraint to \(\hat{y}_i \ge \hat{y}_j\) whenever \(X_i \le X_j\).
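A minimal sketch of fitting the estimator to noisy 1-D data (synthetic data, illustrative only):

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    rng = np.random.RandomState(0)
    X = np.arange(50, dtype=float)
    y = X + 10 * rng.randn(50)            # increasing trend plus noise

    iso = IsotonicRegression(increasing=True)
    y_hat = iso.fit_transform(X, y)       # non-decreasing fitted values
    print(np.all(np.diff(y_hat) >= 0))    # True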
Demo of DBSCAN clustering algorithm
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) finds core samples in regions of high density and expands clusters from them. This algorithm is good for data which contains clusters of similar density. See the Comparing different clustering algorithms on toy datasets example for a demonstration of different clustering algorithms on 2D datasets.
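A condensed sketch of the same idea on synthetic blobs (the eps and min_samples values are illustrative and data-dependent):

    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_blobs
    from sklearn.preprocessing import StandardScaler

    X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)
    X = StandardScaler().fit_transform(X)

    labels = DBSCAN(eps=0.3, min_samples=10).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 marks noise
    print("estimated clusters:", n_clusters)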
Feature transformations with ensembles of trees
Transform your features into a higher-dimensional, sparse space, then train a linear model on these features. First fit an ensemble of trees (totally random trees, a random forest, or gradient boosted trees) on the training set. Then each leaf of each tree in the ensemble is assigned a fixed arbitrary feature index in a new feature space; these leaf indices are then encoded in a one-hot fashion.
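A condensed sketch of this pipeline using gradient boosted trees (the dataset and the split between the tree-fitting and linear-model halves are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import OneHotEncoder

    X, y = make_classification(n_samples=2000, random_state=0)
    X_trees, X_linear, y_trees, y_linear = train_test_split(
        X, y, test_size=0.5, random_state=0)

    gbt = GradientBoostingClassifier(n_estimators=10, random_state=0)
    gbt.fit(X_trees, y_trees)

    # apply() gives the leaf index of every sample in every tree;
    # one-hot encoding those indices yields the sparse feature space.
    leaves = gbt.apply(X_linear)[:, :, 0]
    encoder = OneHotEncoder(handle_unknown="ignore")
    linear = LogisticRegression(max_iter=1000)
    linear.fit(encoder.fit_transform(leaves), y_linear)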
NuSVR
class sklearn.svm.NuSVR(*, nu=0.5, C=1.0, kernel='rbf', degree=3, gamma='scale', coef0=0.0, shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)
Nu Support Vector Regression. Similar to NuSVC, for regression, uses a parameter nu to control the number of support vectors. However, unlike NuSVC, where nu replaces C, here nu replaces the parameter epsilon of epsilon-SVR.
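A minimal usage sketch with random data, in the spirit of the class docstring example:

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import NuSVR

    rng = np.random.RandomState(0)
    X = rng.randn(100, 5)
    y = rng.randn(100)

    regr = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=1.0))
    regr.fit(X, y)
    print(regr.predict(X[:2]))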
1.11. Ensembles: Gradient boosting, random forests, bagging, voting, stacking
Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator. Two very famous examples of ensemble methods are gradient-boosted trees and random forests. More generally, ensemble models can be applied to any base learner, for example in bagging, model stacking, voting, or boosting methods such as AdaBoost.
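A minimal sketch comparing two of these ensembles on the same synthetic task (models and data are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, random_state=0)
    for model in (RandomForestClassifier(random_state=0),
                  GradientBoostingClassifier(random_state=0)):
        scores = cross_val_score(model, X, y, cv=5)
        print(type(model).__name__, scores.mean())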
Gaussian Mixture Model Selection
This example shows that model selection can be performed with Gaussian Mixture Models (GMM) using information-theoretic criteria. Model selection concerns both the covariance type and the number of components in the model. In this case, both the Akaike information criterion (AIC) and the Bayes information criterion (BIC) can be used.
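A hedged sketch of the selection loop: fit a GMM for each covariance type and component count on a small illustrative grid, and keep the model with the lowest BIC:

    import numpy as np
    from sklearn.datasets import make_blobs
    from sklearn.mixture import GaussianMixture

    X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

    best_bic, best_gmm = np.inf, None
    for cov_type in ("spherical", "tied", "diag", "full"):
        for n_components in range(1, 6):
            gmm = GaussianMixture(n_components=n_components,
                                  covariance_type=cov_type,
                                  random_state=0).fit(X)
            bic = gmm.bic(X)
            if bic < best_bic:
                best_bic, best_gmm = bic, gmm

    print(best_gmm.n_components, best_gmm.covariance_type)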
An introduction to machine learning with scikit-learn
Machine learning: the problem setting
In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number, for instance a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features. Learning problems fall into a few categories, chiefly supervised learning (classification and regression) and unsupervised learning (such as clustering).
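The canonical fit/predict pattern, sketched here on the iris dataset with a nearest-neighbors classifier (the choice of estimator is arbitrary):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = KNeighborsClassifier().fit(X_train, y_train)   # supervised learning
    print(clf.score(X_test, y_test))                     # accuracy on held-out data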
mutual_info_score
sklearn.metrics.mutual_info_score(labels_true, labels_pred, *, contingency=None)
Mutual Information between two clusterings. The Mutual Information is a measure of the similarity between two labels of the same data. Where \(|U_i|\) is the number of the samples in cluster \(U_i\) and \(|V_j|\) is the number of the samples in cluster \(V_j\), the Mutual Information between clusterings \(U\) and \(V\) is given as:

\[MI(U, V) = \sum_{i=1}^{|U|} \sum_{j=1}^{|V|} \frac{|U_i \cap V_j|}{N} \log \frac{N |U_i \cap V_j|}{|U_i| |V_j|}\]
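A minimal call sketch on two hand-written labelings of the same six samples:

    from sklearn.metrics import mutual_info_score

    labels_true = [0, 0, 0, 1, 1, 1]
    labels_pred = [0, 0, 1, 1, 2, 2]
    print(mutual_info_score(labels_true, labels_pred))   # non-negative, in nats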
feature_names_in_ : ndarray of shape (n_features_in_,)
    Names of features seen during fit. Defined only when X has feature names that are all strings.

See also:
f_classif : ANOVA F-value between label/feature for classification tasks.
mutual_info_classif : Mutual information for a discrete target.
chi2 : Chi-squared stats of non-negative features for classification tasks.
f_regression : F-value between label/feature for regression tasks.
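A short sketch of using f_classif through SelectKBest on a classification task (dataset and k are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, f_classif

    X, y = load_iris(return_X_y=True)
    selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
    print(selector.scores_)         # ANOVA F-value per feature
    print(selector.get_support())   # mask of the two selected features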
RandomTreesEmbedding
class sklearn.ensemble.RandomTreesEmbedding(n_estimators=100, *, max_depth=5, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_leaf_nodes=None, min_impurity_decrease=0.0, sparse_output=True, n_jobs=None, random_state=None, verbose=0, warm_start=False)
An ensemble of totally random trees. An unsupervised transformation of a dataset to a high-dimensional sparse representation.
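A minimal transform sketch showing the shape of the resulting sparse encoding (data and hyperparameters are illustrative):

    from sklearn.datasets import make_circles
    from sklearn.ensemble import RandomTreesEmbedding

    X, _ = make_circles(factor=0.5, noise=0.05, random_state=0)
    embedder = RandomTreesEmbedding(n_estimators=10, max_depth=3, random_state=0)
    X_sparse = embedder.fit_transform(X)
    print(X_sparse.shape)   # (n_samples, total number of leaves), sparse matrix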
1.7. Gaussian Processes
Gaussian Processes (GP) are a nonparametric supervised learning method used to solve regression and probabilistic classification problems. The advantages of Gaussian processes are:
- The prediction interpolates the observations (at least for regular kernels).
- The prediction is probabilistic (Gaussian), so that one can compute empirical confidence intervals and decide, based on those, whether one should refit (online fitting, adaptive fitting) the prediction in some region of interest.
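A hedged sketch of the probabilistic prediction: a GP regressor with an RBF kernel returning a mean and a standard deviation per query point (kernel, noise level, and data are illustrative):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.RandomState(0)
    X = rng.uniform(0, 5, 20)[:, None]
    y = np.sin(X).ravel() + 0.1 * rng.randn(20)

    gpr = GaussianProcessRegressor(kernel=RBF(), alpha=0.1 ** 2, random_state=0)
    gpr.fit(X, y)
    mean, std = gpr.predict(X[:3], return_std=True)   # Gaussian predictive output
    print(mean, std)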