[B! resource] kolja's Bookmarks

kolja id:kolja

kolja's bookmarks tagged resource (20)

Endangered Languages Project
kolja 2012/06/22
resource

google

linguistics
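Teramura Error Example Collection Database (寺村誤用例集データベース)
On the release of the PDF and database editions of Hideo Teramura's 『外国人学習者の日本語誤用例集』 (A Collection of Japanese Errors by Foreign Learners; Osaka University, 1990). Taro Kageyama, Director-General, National Institute for Japanese Language and Linguistics. The National Institute for Japanese Language and Linguistics has digitized, with the consent of his family, and is now releasing online 『外国人学習者の日本語誤用例集』 (1990), one of the last works left at Osaka University by the late Professor Hideo Teramura (1928-1990), who laid the foundations of research on Japanese language education in Japan. This is part of the research in two joint research projects carried out at the institute's Japanese Language Education Research and Information Center: "Research on Japanese Language Education in a Multicultural Society" (project leader: Kumiko Sakoda) and "Development of a Handbook of Basic Verb Usage for Learners of Japanese" (project leader: Prashant Pardeshi). By a curious coincidence, that I, as Director-General, am able to carry on in this form the work of Professor Teramura, my teacher from my days at Osaka University of Foreign Studies, is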
kolja 2011/12/10
linguistics

resource

Corpus
Santa Barbara Corpus of Spoken American English
The Santa Barbara Corpus of Spoken American English is based on a large body of recordings of naturally occurring spoken interaction from all over the United States. The Santa Barbara Corpus represents a wide variety of people of different regional origins, ages, occupations, genders, and ethnic and social backgrounds. The predominant form of language use represented is face-to-face conversation,
kolja 2011/10/19
corpus

linguistics

resource
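Databases, Corpora, and Materials | National Institute for Japanese Language and Linguistics
Inter-University Research Institute Corporation, National Institutes for the Humanities, National Institute for Japanese Language and Linguistics, 10-2 Midori-cho, Tachikawa, Tokyo 190-8561 [Access] Tel. 0570-08-8595 (Navi Dial) (c) National Institute for Japanese Language and Linguistics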
kolja 2011/06/04
resource

papers

linguistics
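How to Write Text - 伝わるデザイン (Tsutawaru Design): Universal Design for Research Presentations
Researchers, and the undergraduate and graduate students involved in research, have to present their results throughout the year at lab seminars, academic conferences, and similar venues. In recent years, opportunities to give presentations and talks for general audiences of non-scientists (outreach activities) have also been increasing. Writing research papers and reports, and preparing budget applications and project proposals to secure research funding, are likewise essential work for researchers. All of these are acts of conveying information to others (fellow researchers, reviewers, the general public), and they call for accurate and effective communication. However, if you prepare materials in your own ad hoc style and send out information haphazardly, it will not get across smoothly. Sometimes incorrect information is conveyed, and the value of the research may not even be judged fairly. To convey information to others accurately and smoothly, designing the information, that is, arranging text so that it is easy to read, making figures and tables easy to view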
kolja 2010/08/04
presentation

tips

resource
Scheme NLTK: Home
kolja 2010/07/10
computer

linguistics

scheme

tool

resource

NLP
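Knowledge search site: JapanKnowledge (ジャパンナレッジ), an internet dictionary and encyclopedia search site carrying about 40 dictionaries and encyclopedias
『現代用語の基礎知識』 (Gendai Yogo no Kiso Chishiki) has been updated to the 2009 edition. In addition to the familiar "Buzzword of the Year Awards" and "Yononaka-pedia", the 2009 edition also includes the print edition's separate supplement, 「世界の国と人々学習帳」 (a study notebook on the world's countries and peoples)! Please make use of it as a tool for keeping up with the ever-changing "now"!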
kolja 2010/07/06
web service

resource
Stress Pattern Database
Welcome. This database presents atheoretical descriptions of the documented dominant stress patterns of the world's languages. This is accomplished in a few ways: by presenting an English prose description of the placement of stress, by extending Bailey's (1995) Syllable Priority Code system to handle secondary stress, and by including a finite-state automaton representation of each stress pattern.
kolja 2010/06/29
A stress database for languages all over the world!

linguistics

phonetics

phonology

resource
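Eijiro on the WEB (英辞郎 on the WEB): Help for the Sort and Frequency Count Functions
This service is currently operated as a beta version. Please note that the service may be suspended without notice due to unforeseen circumstances. About sorting and frequency counting in Eijiro on the WEB: with the "Sort" and "Frequency count" functions of Eijiro on the WEB, you can easily find out, from the Eijiro on the WEB data, which words a search keyword tends to co-occur with (be used together with). The functions offered here are common analysis methods in corpus linguistics, which works with large volumes of text, but because they reveal things such as "this verb often takes this noun as its object" or "when this verb is accompanied by this preposition, it seems to be used as an idiom", they can also be applied to tasks such as English writing. "Sort" function: "KWIC" (short for Keyword In Context, "kwick"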
kolja 2010/02/18
resource

computer

linguistics
CasualConc
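CasualConc is a concordancer (corpus analysis software) for macOS. The first version was not up to advanced research, and I named it CasualConc in the sense that it is simple to use. Its features include KWIC, word cluster analysis, collocation analysis, and word frequency list creation. I think the current versions (3.0 and later) have reached a level that holds up well enough for practical and research use. I also make various other applications. Please follow the links near the bottom of this page or the Other Applications link on the left. If you are looking for CasualTranscriber, go here. Many problems still remain unaddressed, so please report bugs. Text file format: basically plain text files (.txt) encoded in ASCII or UTF-8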
kolja 2009/10/26
computer

linguistics

resource
FrameNet MT tools
FrameNet MT tools: FrameNet Japanese MT Project. This directory contains resources developed by the Japanese machine translation project in the NSF-funded FrameNet online effort. The directories and their contents are described. bin: Tool used in Japanese frames creation effort. Described in doc directory and in http://bulba.sdsu.edu/enoue/vgrep_help.html http://bulba.sdsu.edu/enoue/vgrep_manual.htm
kolja 2009/07/27
computer

linguistics

resource
Springer Exemplar
Providing researchers with access to millions of scientific documents from journals, books, series, protocols, reference works and proceedings.
kolja 2009/07/20
Wow, this looks really useful.

computer

English

howto

resource
https://www.americancorpus.org/
kolja 2009/06/16
resource

computer

English

linguistics
SSWL: directory: homepage
SSWL - MIGRATED TO TERRALING. Syntactic Structures of the World's Languages. As of 9/27/2017, this site will no longer be used for data entry or administrative purposes. The SSWL database has migrated permanently to its new home at Terraling. Please reset your password on Terraling (sign in with your username), and reset your password to access SSWL. SSWL is a searchable database that allows users to
kolja 2009/06/10
computer

resource

linguistics
Create and search a text corpus | Sketch Engine
Sketch Engine is the ultimate tool to explore how language works. Its algorithms analyze authentic texts of billions of words (text corpora) to identify instantly what is typical in language and what is rare, unusual or emerging usage. It is also designed for text analysis or text mining applications. Sketch Engine is used by linguists, lexicographers, translators, students and teachers. It is a f
kolja 2009/05/29
resource

English
NLTK :: Natural Language Toolkit
Natural Language Toolkit. NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an ac
kolja 2009/03/30
computer

resource
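Verb Argument Structure Thesaurus (動詞項構造シソーラス)
To process human language by computer, we are building a computer-oriented dictionary that organizes the concepts of verbs, and we distribute it for free (currently 4,425 words (7,473 senses)). Unlike nouns, verb concepts depend heavily on the form of their relation to the nouns they are in a dependency relation with. For example, as in 彼が(Agent)秘密を(Theme)握る ("he grasps a secret") → 秘密を(Theme)得る ("obtain a secret") and 彼が(Agent)おにぎりを(Theme)握る ("he presses a rice ball into shape") → おにぎりを(Theme)作る ("make a rice ball"), the meaning a verb carries differs depending on the combination of surface case (ガ, ヲ, ..) and deep case (Agent, Theme, ..), and its relations to other verbs change accordingly (in lexical semantics within linguistics, the relation between deep cases and a verb is called argument structure). It is impossible to describe such relations for every pair of noun and verb, but we are building a verb dictionary that is feasible to construct and that is needed to infer them. As a concrete policy, we posit word senses, and for each sense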
kolja 2009/03/06
computer

linguistics

resource
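Japanese WordNet (wn-ja)
Japanese WordNet. In this project, inspired by Princeton WordNet and the Global WordNet Grid, we are building a Japanese WordNet and releasing it openly. The National Institute of Information and Communications Technology (NICT), an independent administrative institution, began developing the Japanese WordNet in 2006 as part of its support for natural language processing research. The first version, version 0.9, was released in February 2009. This version 0.9 attaches Japanese to the synsets of Princeton WordNet. Of course, it is necessary to add Japanese synsets that are not in Princeton WordNet, and also, to the hierarchical structure of synsets found in Princeton WordNet,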
kolja 2009/03/06
Wonderful

linguistics

computer

resource
Links to CORPUS Site
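On-Line Corpus (2006/03/05) ◆ Screen Play 映語犬サク: search for English expressions in movie screenplays ◆ Word list creation (lem maPlus), by Hiroaki Sato (makes it easy to create word lists) ◆ Shogakukan Corpus Network (BNC and Word Banks are available) ◆ Tokushima University's Sudachi KWIC Search (SKWICS): American (about 1.65 million words), British (about 1.80 million words), and Australian (about 1.65 million words) English-language newspapers (2000-2003) ◆ Public Domain modern English Search: phrase search over a corpus of roughly 100M, centered on literary works, made available by the University of Michigan. ◆ BNC On-Line: search and use the 100-million-word corpus of contemporary English built mainly by Longman and the University of Oxford. ◆ ANC (American Na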
kolja 2009/02/19
computer

linguistics

resource
UMass Amherst Linguistics Sentiment Corpora
Noah Constant, Christopher Davis, Christopher Potts, and Florian Schwarz. The UMass Amherst Linguistics Sentiment Corpora consist of n-gram counts extracted from over 700,000 online product reviews in Chinese, English, German, and Japanese. The files are UTF-8 encoded text. They are formatted to be read in as R data frames, but they can easily be manipulated with other tools. We are releasing them
kolja 2009/01/05
linguistics

computer

English

resource