[B! nlp][multiModal] manboubird's Bookmarks

manboubird id:manboubird

manboubird's bookmarks on nlp and multiModal (6)

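[Japanese model included] Pretrained models recommended for anyone doing multimodal processing in 2022 - Qiita
Key points: A Japanese model of OpenAI CLIP has been built and released; please make use of it. CLIP is an embedding model for images and text (a model that converts its input into a fixed-length vector representing its meaning), with the property that images and texts with similar meanings map to nearby vectors. It was trained on 400 million diverse image-text pairs and has strong zero-shot performance. Example applications: image search by text, similar-image search, classification of images and/or text, clustering, and feature generation for images or text. The Japanese CLIP model can be downloaded from the Hugging Face Model Hub. Sample code and commentary explaining how to apply it will be published as four articles, one at a time. Progress: 1/4. How to use the Japanese CLIP model, with sample code (in progress). Because it would run long, the usage guide has been moved to a separate article. Right away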
manboubird 2023/01/22
openAi

clip

search

multimodal

nlp

multilingual
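
The property described in the summary above (images and texts with similar meanings map to nearby vectors) is what makes zero-shot classification possible: embed each candidate label as text, embed the image, and pick the label whose vector is closest. The following is a minimal sketch in Python, not code from the bookmarked article. It assumes hypothetical 3-dimensional embeddings invented for illustration; in practice a CLIP model would produce the vectors, and the function names here are illustrative rather than part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose text embedding is closest to the image embedding."""
    return max(
        label_embeddings,
        key=lambda label: cosine_similarity(image_embedding, label_embeddings[label]),
    )


# Hypothetical embeddings, made up for this example.
# A real model would output much longer vectors.
label_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]  # assumed to come from an image of a dog

best = zero_shot_classify(image_embedding, label_embeddings)
print(best)
```

The same similarity function covers the other applications listed in the summary: ranking images against a text query is text-to-image search, and ranking images against another image is similar-image search.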
Bootstrapping a multimodal project using MMF, a PyTorch powered MultiModal Framework
manboubird 2021/10/20
multimodal

nlp

computerVision

facebook

mmf
MMF
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR). Less Boilerplate: MMF is designed from the ground up to let you focus on what matters -- your model -- by providing boilerplate code for distributed training, common datasets and state-of-the-art pretrained baselines out-of-the-box. Powered by PyTorch: MMF is built on top of PyTorch that brings all of its power
manboubird 2021/10/19
multimodal

facebook

mmf

computerVision

nlp

lib
Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images - MIT
♰ Universitat Politecnica de Catalunya ✦ Massachusetts Institute of Technology ✥ Qatar Computing Research Institute Abstract: In this work we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retr
manboubird 2021/06/19
knowledgeGraph

food

multimodal

nlp

computerVision

paper

cvpr

recipe

cooking
From image to language and back again (Journal Article) | NSF PAGES
manboubird 2021/05/23
paper

computerVision
Visual Recognition with Text - Winter 2017
manboubird 2021/05/15
computerVision

multiModal

lecture

knowledgeGraph

nlp

trontoUniversity