archive.ics.uci.edu
Predict whether income exceeds $50K/yr based on census data. Also known as "Census Income" dataset.
The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (bank term deposit) would be subscribed ('yes') or not ('no'). There are four datasets: 1) bank-additional-full.csv with all examples (41188) and 20 inputs, ordered
Welcome to the UC Irvine Machine Learning Repository We currently maintain 665 datasets as a service to the machine learning community. Here, you can donate and find datasets used by millions of people all around the world!
Welcome to the UC Irvine Machine Learning Repository We currently maintain 653 datasets as a service to the machine learning community. Here, you can donate and find datasets used by millions of people all around the world!
1. 2.4 GHZ Indoor Channel Measurements: Measurements of S21, consisting of 10 sweeps; each sweep contains 601 frequency points spaced 0.167 MHz apart to cover a 100 MHz band centered at 2.4 GHz. 2. 3D Road Network (North Jutland, Denmark): 3D road network with highly accurate elevation information (±20 cm) from Denmark, used in eco-routing and fuel/CO2-estimation routing algorithms. 3. AAAI 2013 Ac
All attributes are numeric and continuous N. Attrib. 1 Q-E (input flow to plant) 2 ZN-E (input Zinc to plant) 3 PH-E (input pH to plant) 4 DBO-E (input Biological demand of oxygen to plant) 5 DQO-E (input chemical demand of oxygen to plant) 6 SS-E (input suspended solids to plant) 7 SSV-E (input volatile suspended solids to plant) 8 SED-E (input sediments to plant) 9 COND-E (input conductivity to p
Number of Attributes: 61 (58 predictive attributes, 2 non-predictive, 1 goal field) Attribute Information: 0. url: URL of the article (non-predictive) 1. timedelta: Days between the article publication and the dataset acquisition (non-predictive) 2. n_tokens_title: Number of words in the title 3. n_tokens_content: Number of words in the content 4. n_unique_tokens: Rate of unique words in the conte
This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer.
Multivariate, Sequential, Time-Series, Domain-Theory
This archive contains 2075259 measurements gathered in a house located in Sceaux (7 km from Paris, France) between December 2006 and November 2010 (47 months). Notes: 1. (global_active_power*1000/60 - sub_metering_1 - sub_metering_2 - sub_metering_3) represents the active energy consumed every minute (in watt-hours) in the household by electrical equipment not measured in sub-meterings 1, 2 and 3. 2.Th
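The formula in note 1 can be sketched in Python as follows. This is a minimal illustration, not code from the dataset's documentation: the function name is hypothetical, and it assumes `global_active_power` is a minute-averaged value in kilowatts and the three sub-metering readings are in watt-hours, which is what the `*1000/60` conversion implies.

```python
def unmetered_energy_wh(global_active_power, sub_metering_1,
                        sub_metering_2, sub_metering_3):
    """Active energy (Wh) used in one minute by equipment outside sub-meterings 1-3.

    Assumes global_active_power is in kilowatts (minute-averaged) and the
    sub-metering values are in watt-hours; both units are assumptions.
    """
    # kW -> W (*1000), then one minute of that power expressed in Wh (/60).
    total_wh = global_active_power * 1000 / 60
    return total_wh - sub_metering_1 - sub_metering_2 - sub_metering_3


# Hypothetical reading: 3.0 kW total, sub-meterings of 1, 2 and 17 Wh.
remainder = unmetered_energy_wh(3.0, 1.0, 2.0, 17.0)
print(remainder)
```

The example values are made up for illustration and are not taken from the archive.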
The dataset is derived from customer reviews on the Amazon Commerce website for authorship identification. Most previous studies ran identification experiments for two to ten authors. But in the online context, reviews to be identified usually have more potential authors, and classification algorithms are normally not adapted to a large number of target classes. To examine the robustness
These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. I think that the initial data set had around 30 variables, but for some reason I only have the 13 dimensional version. I had a list of what the 30 or so variables
This is one of the earliest datasets used in the literature on classification methods and widely used in statistics and machine learning. The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are not linearly separable from each other. Predicted attribute: class of iris plant. This is an e
docword.enron.txt.gz, docword.kos.txt.gz, docword.nips.txt.gz, docword.nytimes.txt.gz, docword.pubmed.txt.gz, readme.txt, vocab.enron.txt, vocab.kos.txt, vocab.nips.txt, vocab.nytimes.txt, vocab.pubmed.txt
This data set includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family (pp. 500-525). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no simple rule for determining t
For each text collection, D is the number of documents, W is the number of words in the vocabulary, and N is the total number of words in the collection (below, NNZ is the number of nonzero counts in the bag-of-words). After tokenization and removal of stopwords, the vocabulary of unique words was truncated by only keeping words that occurred more than ten times. Individual document names (i.e. a
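The quantities D, W, N and NNZ, and the vocabulary truncation, can be illustrated with a short Python sketch. It assumes the bag-of-words is available in memory as `(doc_id, word_id, count)` triples and that "occurred more than ten times" means total occurrences across the whole collection; both are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def collection_stats(triples):
    """Compute D, W, N and NNZ from (doc_id, word_id, count) triples.

    W is taken as the number of distinct word ids that appear, which
    assumes every vocabulary word occurs at least once.
    """
    docs, words = set(), set()
    total_words = 0
    nonzero = 0
    for doc_id, word_id, count in triples:
        if count <= 0:
            continue  # only nonzero counts belong in the bag-of-words
        docs.add(doc_id)
        words.add(word_id)
        total_words += count
        nonzero += 1
    return {"D": len(docs), "W": len(words), "N": total_words, "NNZ": nonzero}


def truncate_vocabulary(tokenized_docs, threshold=10):
    """Keep only words whose total count in the collection exceeds threshold."""
    totals = Counter(token for doc in tokenized_docs for token in doc)
    return {word for word, count in totals.items() if count > threshold}


# Hypothetical toy data, not taken from any of the collections.
stats = collection_stats([(1, 1, 2), (1, 3, 1), (2, 1, 4)])
kept = truncate_vocabulary([["a"] * 11, ["b"] * 10])
print(stats, kept)
```

In this toy run, the word appearing 11 times is kept and the word appearing exactly 10 times is dropped, matching the "more than ten times" wording.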
I'm sorry, the dataset "Netflix Prize" does not appear to exist. A note from the donor regarding Netflix data: "Thank you for your interest in the Netflix Prize dataset. The dataset is no longer available."
Welcome to the UC Irvine Machine Learning Repository We currently maintain 668 datasets as a service to the machine learning community. Here, you can donate and find datasets used by millions of people all around the world!