Hayato's bookmarks about tika and lucene (1)

  • Apache Tika – Apache Tika

    Apache Tika - a content analysis toolkit The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more. You can find the latest release on the download page. Please see

    Hayato 2009/06/22
    Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.