


fuba's bookmarks on corpus and twitter (1)

  • Tweets2011 Twitter Collection

    As part of the TREC 2011 microblog track, Twitter provided identifiers for approximately 16 million tweets sampled between January 23rd and February 8th, 2011. The corpus is designed to be a reusable, representative sample of the twittersphere - i.e. both important and spam tweets are included. The Tweets2011 corpus is unusual in that what you get is a list of tweet identifiers, and the actual twe

    fuba 2011/09/02
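    This is the corpus used in the TREC 2011 microblog track. They only give you the IDs, so you have to crawl the tweets yourself. It sounds tedious, but maybe I should build it anyway…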