[B! Data] rishida's bookmarks

arXivTimes/datasets at master · arXivTimes/arXivTimes

rishida 2017/09/08

Data

fastText/pretrained-vectors.md at master · facebookresearch/fastText · GitHub

Pre-trained word vectors: We are publishing pre-trained word vectors for 294 languages, trained on Wikipedia using fastText. These vectors in dimension 300 were obtained using the skip-gram model described in Bojanowski et al. (2016) with default parameters. Format: The word vectors come in both the binary and text default formats of fastText. In the text format, each line contains a word followed by

rishida 2017/04/11

Data
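As a rough illustration of the fastText text format mentioned in the entry above: the bookmarked excerpt is cut off after "a word followed by", so the sketch below assumes the commonly seen `.vec` layout, namely a header line with the vocabulary size and dimension, then one word per line followed by its space-separated vector components. The function name `load_text_vectors` and the sample data are hypothetical.

```python
import io


def load_text_vectors(stream):
    """Parse word vectors from a text stream.

    Assumed layout (not confirmed by the excerpt above):
      line 1:  "<vocab_size> <dimension>"
      others:  "<word> <v1> <v2> ... <vd>"
    """
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) <= dim:
            continue  # skip blank or malformed lines
        # The last `dim` fields are the vector; anything before is the word.
        word = " ".join(parts[:-dim])
        vectors[word] = [float(x) for x in parts[-dim:]]
    return vocab_size, dim, vectors


# Tiny in-memory sample standing in for a real .vec file.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 1.0 2.0 3.0\n")
vocab_size, dim, vectors = load_text_vectors(sample)
print(vocab_size, dim, vectors["world"])
```

For a real file, the same function could be called on a handle opened with `open(path, encoding="utf-8")`; the full 300-dimensional files are large, so reading only the first few thousand lines may be preferable when experimenting.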

GitHub - Kyubyong/wordvectors: Pre-trained word vectors of 30+ languages

rishida 2017/04/11

Data

ggtree paper published

Today ggtree received 100 stars, and I found via a tweet that the paper went online the same day. I am quite excited about it. Now ggtree is citable by doi:10.1111/2041-210X.12628. During the review period, ggtree had already been cited by several papers: 1. https://doi.org/10.7717/peerj.976 Annotating phylogenetic tree with image files (i.e. secondary structure). 2. http://dx.doi.org/10.1016/j.meegi

rishida 2016/08/29

Data

A list of open data usable for machine learning (updated as needed) - Beginning AI

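"I want to do machine learning, but I have no data! I want to try other data!" For people in that situation, I have collected open data that can be used for machine learning. If you know of other datasets you would recommend, I would be glad if you shared them. m(__)m UC Irvine Machine Learning Repository: datasets published by the University of California, Irvine. It has 351 datasets, fewer than DATA GO (described later), but almost all of them are intended for machine learning, so it is highly recommended. UCI Machine Learning Repository: the famous iris dataset can also be found here. National Institute of Informatics, Informatics Research Data Repository dataset list: includes data from Yahoo, Rakuten, Niconico, and others. DATA.GO.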

rishida 2016/08/29

Data

VisualGenome

Visual Genome is a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language.

rishida 2016/08/26

Data

Summary of training data for natural language processing and speech recognition - Qiita

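Data for learning natural language processing and speech recognition is provided by various research institutions. Here I summarize where that data can be found. If you know of other sources, please let me know. Advanced Language Information Forum (ALAGIN) (paid): an organization that provides a variety of language resources, speech resources, and software tools. Downloading requires membership registration, however (the enrollment fee is 100,000 yen, but there is no annual fee). The datasets can be browsed here. Although restricted to research institutions, Rakuten data and others are also available. ALAGIN language resources and speech resources site. In addition, members can apply through a priority quota for the speech recognition and spoken dialogue technology seminar held every year. Anyone planning to work on spoken dialogue should attend this seminar at least once, so it is also recommended. Corpus Development Center (paid): a site whose name says exactly what it is. Written language, spoken language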

rishida 2016/07/13

Data

United Nations Parallel Corpus

United Nations, Department for General Assembly and Conference Management. United Nations Parallel Corpus. Introduction: The United Nations Parallel Corpus v1.0 is composed of official records and other parliamentary documents of the United Nations that are in the public domain. These documents are mostly available in the six official languages of the United Nations. The current version of the corpus

rishida 2016/06/06

“The United Nations Parallel Corpus v1.0 is composed of official records and other parliamentary documents of the United Nations that are in the public domain”

Data

Welcome to The Cancer Imaging Archive - The Cancer Imaging Archive (TCIA)

Welcome to The Cancer Imaging Archive. The Cancer Imaging Archive (TCIA) is a service which de-identifies and hosts a large archive of medical images of cancer accessible for public download.

rishida 2016/04/06

Data

SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining

SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. Stefano Baccianella, Andrea Esuli, Fabrizio Sebastiani (firstname.lastname@isti.cnr.it), Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy. LREC 2010, Malta, May 17–23, 2010.

rishida 2016/04/06

Data

Let's build a synonym dictionary from the Wikipedia database, plus a little visualization - 蟻の実験工房（別館ラボ）

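This time: let's build a synonym dictionary from the Wikipedia database! The idea came from the slides "WikipediaからのSolr用類義語辞書の自動生成" (Automatic generation of a synonym dictionary for Solr from Wikipedia), http://www.slideshare.net/KojiSekiguchi/wikipediasolr?from=new_upload_email . It occurred to me that a "#redirect" (redirected from ...) relationship in the Wikipedia database can be regarded as a synonym relationship, so this data might make it easy to build a synonym dictionary. That was the starting point. The file I actually built from the #redirect relationships, using Wikipedia data loaded into a MySQL database, is here: http://wordword.antlabo.jp/data/dougogolist.csv.gz (w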

rishida 2016/01/26

Data
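The redirect idea described in the entry above can be sketched roughly as follows. This is not the blog author's code (which works on a MySQL import of Wikipedia); it assumes the redirect relationships have already been extracted as (redirect title, target title) pairs. The sample pairs and the function name `build_synonym_groups` are made up for illustration.

```python
from collections import defaultdict


def build_synonym_groups(redirect_pairs):
    """Group redirect titles under the article title they redirect to."""
    groups = defaultdict(set)
    for source_title, target_title in redirect_pairs:
        if source_title != target_title:  # ignore self-redirects
            groups[target_title].add(source_title)
    # One list per target: the target first, then its redirect titles in sorted order.
    return {target: [target] + sorted(sources) for target, sources in groups.items()}


# Hypothetical sample pairs standing in for rows extracted from a Wikipedia dump.
pairs = [
    ("NYC", "New York City"),
    ("Big Apple", "New York City"),
    ("UN", "United Nations"),
]
synonyms = build_synonym_groups(pairs)
for target, words in synonyms.items():
    print(",".join(words))
```

Each printed line is one comma-separated synonym group. Not every redirect is a true synonym (misspellings and related topics also redirect), so some filtering would likely be needed before using the result as a search dictionary.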

Latest information on 英辞郎 (Eijiro / EIJIRO)

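What is Eijiro / Sample data / Terms of use / Apps for iPhone, iPad, and iPod touch: i英辞郎, Handy英辞郎, the vocabulary app 『スライド単語』 / Android apps: Eijiroid, Handy英辞郎 / Desktop apps: 英辞郎 for macOS Dictionary.app, dedicated data for the Mac app 『まるごと英和検索 for 英辞郎』, RapidDict / Eijiro text data / Other data (free): 子辞郎, 考辞郎, 笑辞郎, 旅辞郎, 順辞郎 (frequency ranking of English words) / 2,763,236 Yesterday: 383 Today: 267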

rishida 2016/01/26

An English-Japanese and Japanese-English dictionary. Partly available for use?

Data

East Co., Ltd. (イースト株式会社)

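Because many of our engineers are proficient in C#, we have taken on many system development projects based on Windows technologies. In recent years, inquiries about building systems on Azure have also increased, and we can offer proposals that make use of technologies with a strong affinity for Azure, such as Visual Studio, .NET, Windows Server, Active Directory, and SQL Server. Besides new development projects, we also welcome consultations and requests for migrating or reworking legacy systems built on classic ASP, VB.NET, Adobe Flash, and similar technologies. What is .NET Framework? It is a software development framework for building and running applications on Windows. A .NET platform for building apps for Windows, iOS, Android, and others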

rishida 2016/01/26

An API for an English-Japanese and Japanese-English dictionary

The EDICT Dictionary File

The EDICT Dictionary File Welcome to the Home Page of the EDICT file within the JMdict/EDICT Project. This page has been written by Jim Breen (hereafter "I" or "me") and is intended as an overview of the file, with links to more detail elsewhere. Background Way back in 1991 I began to experiment with handling Japanese text in computer files, and decided to try writing a dictionary search program i

rishida 2016/01/26

Japanese-English dictionary data

Data

Japanese parallel data

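This is a list of language resources that can be used to build machine translation systems involving Japanese. It mainly covers Japanese-English translation resources, but a few multilingual corpora are listed toward the end. If something is missing from this list, please don't hesitate to let me know! Lists for language pairs that do not include Japanese can be found on many other sites: 1 2 3. Japanese-English parallel corpora: the following resources are corpora of parallel sentences that can be used to train statistical machine translation systems. Each entry gives the name, link, number of sentences, description, availability for research and commercial use, and approximate price. The list focuses on corpora of 100,000 sentences or more, though some smaller ones are included. Name / Sentences / Research use / Commercial use / Description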

rishida 2015/12/07

Just enter a URL! "import.io", a free tool that scrapes content and turns it into data

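What is import.io? import.io is a scraping service: you enter the URL of the page you want to turn into data, and it automatically identifies the data portions and collects the information. It is free to use and requires neither setup nor training for data collection. Because it is as simple as entering a URL and pressing a button, I think it is a data collection tool anyone can use. Below I introduce its basic usage and some examples. Scraping a site periodically puts load on that site, so avoid running it against the same site many times a day. In addition, using the retrieved data as-is may infringe copyright. Basic usage: the main feature of import.io is how easy it is to use. Below, as an example, I retrieve the data from IKEA's sofa search results page.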

rishida 2015/09/11

tscorpus/blog_KNB_001_Keitai_semantics.txt at master · ajb129/tscorpus · GitHub

rishida 2015/09/11

A Japanese treebank with predicate logic annotations

"Plenario", a groundbreaking platform that aggregates the world's open data | Techable

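As typified by "DATA.GO.JP", the data catalog site opened by the Japanese government, and the US federal government's "Data.gov", a great deal of public data from central and local governments has been released as open data in recent years. However, the data is scattered across the agencies responsible for it, and file formats and structures are not standardized. Integrating or analyzing multiple datasets therefore requires advanced specialist skills and an enormous amount of effort. A platform that aggregates the world's open data: to address this, the University of Chicago developed "Plenario", an open-source platform that aggregates open data from around the world, with a focus on the United States. Just by specifying a time period and a location, it lists all open data matching the conditions, regardless of administrative division. Each dataset can be downloaded directly from the platform. For example,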

rishida 2015/08/24

Data

GitHub - cayleygraph/cayley: An open-source graph database

rishida 2015/08/05

“cayley”

Data

Python for geospatial data processing

With the aim of strengthening product development and promoting the use of disruptive tools based on artificial intelligence and machine learning, in 2018 we added Machinalis to our team. The company has become a key player in the Mercado Libre ecosystem, providing machine learning solutions to the industries of retail technology and fin

rishida 2015/08/05

rishida's bookmarks about data (59)
