moro-tyo's Bookmarks - Hatena Bookmark

GitHub - lutzroeder/netron: Visualizer for neural network, deep learning and machine learning models

moro-tyo 2023/03/19

deeplearning

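ONNX Model Tuning Techniques (Basics)

Basics / Advanced 1 / Advanced 2. I'm Hyodo, a member of the Conversational Agent Team at CyberAgent AI Lab. In this post I want to record all of the ONNX tuning techniques I have accumulated over roughly the past six months. I will cover a wide range, from basics you probably already know to fairly tricky tuning. Please bear with the length. The main target audience of this blog is "people tasked with deploying models implemented by researchers to production environments". Some of the information may be useful to researchers, but I expect most of it will not hold much interest for them. I belong to a researcher-centric organization while working as a research engineer, so I am writing this blog from the standpoint of someone who handles the part just before, or just beyond, the research perspective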

moro-tyo 2023/03/19

deeplearning

Google Research, 2022 & beyond: Algorithms for efficient deep learning

Philosophy: We strive to create an environment conducive to many different types of research across many different time scales and levels of risk.

moro-tyo 2023/03/11

deeplearning

Civitai: The Home of Open-Source Generative AI

All sorts of cool pictures created by our community, from simple shapes to detailed landscapes or human faces. A virtual canvas where you can unleash your creativity or get inspired.

moro-tyo 2023/03/11

Google Dataset Search

moro-tyo 2023/03/05

dataset

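The Lineage of Diffusion-Based Text2Image, up to the Point Where Generated Images Start Moving

AI image generation began to show explosive popularity around 2022. I have put together a meta-survey that comprehensively traces its course: a basic explanation of the diffusion models at its core, how technology nurtured in the research community spread to the general public, the problems that arose along the way and still need to be solved, and the advanced methods that keep multiplying day by day. In a fluid field where the world could change as soon as tomorrow, any summary of information inevitably becomes outdated. Even amid such a torrent, I sincerely hope this material can serve, if only a little, as a foundation for surveying the history so far and looking ahead to future growth, and with that I lay down my pen.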

moro-tyo 2023/03/01

Google Cloud Model Cards

moro-tyo 2023/02/26

machinelearning

Zero-shot Image-to-Image Translation

1 Carnegie Mellon University 2 Adobe Research SIGGRAPH 2023 We propose pix2pix-zero, a diffusion-based image-to-image approach that allows users to specify the edit direction on-the-fly (e.g., cat to dog). Our method can directly use pre-trained text-to-image diffusion models, such as Stable Diffusion, for editing real and synthetic images while preserving the input image's structure. Our method i

moro-tyo 2023/02/08

GitHub - salesforce/LAVIS: LAVIS - A One-stop Library for Language-Vision Intelligence

[Model Release] November 2023, released implementation of X-InstructBLIP Paper, Project Page, Website, A simple, yet effective, cross-modality framework built atop frozen LLMs that allows the integration of various modalities (image, video, audio, 3D) without extensive modality-specific customization. [Model Release] July 2023, released implementation of BLIP-Diffusion Paper, Project Page, Website

moro-tyo 2023/02/08

deeplearning

Prompt-to-Prompt

Prompt-to-Prompt Image Editing with Cross-Attention Control Amir Hertz1,2 Ron Mokady1,2 Jay Tenenbaum1 Kfir Aberman1 Yael Pritch1 Daniel Cohen-Or1,2 1 Google Research 2 Tel Aviv University Paper Code Abstract Recent large-scale text-driven synthesis diffusion models have attracted much attention thanks to their remarkable capabilities of generating highly diverse images that follow given text pr

moro-tyo 2023/02/08

TimeSformer: A Transformer That Goes Beyond 3D CNNs to Capture Video

Three key points ✔️ Four types of spatiotemporal self-attention for video were devised. ✔️ Compared with 3D CNN models, training is faster and test efficiency is improved. ✔️ 3D CNN models could only process videos a few seconds long, whereas this approach can also be applied to videos several minutes long. Is Space-Time Attention All You Need for Video Understanding? written by Gedas Bertasius, Heng Wang, Lorenzo Torresani (Submitted on 9 Feb 2021 (v1), last revised 9 Jun 2021 (this version, v4)) Comments: Accepted to ICML 2021 Subjects: Computer Vision and Pattern Reco

moro-tyo 2023/02/05

Dreamix: Video Diffusion Models are General Video Editors

Eyal Molad*,1, Eliahu Horwitz*,1,2, Dani Valevski*,1, Alex Rav Acha1, Yossi Matias1, Yael Pritch1, Yaniv Leviathan†,1, Yedid Hoshen†,1,2 1Google Research, 2The Hebrew University of Jerusalem *Indicates Equal Contribution, †Indicates Equal Advising Given a video and a text prompt, Dreamix edits the video while maintaining fidelity to color, posture, object size and camera pose, resulting in a tempo

moro-tyo 2023/02/03

Google Research, 2022 & beyond: Language, vision and generative models

Language Models The progress on larger and more powerful language models has been one of the most exciting areas of machine learning (ML) research over the last decade. Important advances along the way have included new approaches like sequence-to-sequence learning and our development of the Transformer model, which underlies most of the advances in this space in the last few years. Although langu

moro-tyo 2023/02/02

deeplearning

Google Research, 2022 & beyond: Responsible AI

Philosophy: We strive to create an environment conducive to many different types of research across many different time scales and levels of risk.

moro-tyo 2023/02/01

deeplearning

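Predicting Scene and Style Tags for ZOZOTOWN Products Using Image Recognition - ZOZO TECH BLOG

Introduction. Hello, I'm Yoshimoto from Data Science Block 2 of the ML and Data Department. ZOZOTOWN products carry tags that describe item characteristics (item feature tags), such as "long sleeve", "crew neck", and "floral pattern", and tags that describe the scenes and styles an item suits (scene/style tags), such as "basic", "mode", and "wedding". The brands assign these tags when they register product information. The issues with these tags are the effort required to assign them and the low assignment rate of scene/style tags. There are about 50 item feature tags in, for example, the T-shirt/cut-and-sew category, and about 130 scene/style tags, so choosing the applicable ones for each individual product is laborious work. In addition, because scene/style tags were introduced to ZOZOTOWN a little under two years ago and are still new,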

moro-tyo 2023/01/30

GitHub - google-research/tuning_playbook: A playbook for systematically maximizing the performance of deep learning models.

This document is for engineers and researchers (both individuals and teams) interested in maximizing the performance of deep learning models. We assume basic knowledge of machine learning and deep learning concepts. Our emphasis is on the process of hyperparameter tuning. We touch on other aspects of deep learning training, such as pipeline implementation and optimization, but our treatment of tho

moro-tyo 2023/01/24

deeplearning

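A Thorough Explanation of Stable Diffusion, the Image-Generation AI That Shocked the World! - Qiita

Addendum: The middle layer of the U-Net is always self-attention. This was pointed out to me; thank you. (Code) I'm omiita. On Twitter I introduce topics in artificial intelligence and articles from other media. Please check out @omiita_atiimo too! A Thorough Explanation of Stable Diffusion, the Image-Generation AI That Shocked the World! Santa Claus standing in a futuristic city (generated with Stable Diffusion). In August 2022, a great shock ran through the world: the release of Stable Diffusion. Stable Diffusion is a model that takes text and outputs an image that matches it. Stable Diffusion has nearly one billion parameters and was trained on roughly two billion image-text pairs (LAION-2B). As a result, Stable Diffusion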

moro-tyo 2023/01/14

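VOICEVOX | Free Text-to-Speech Software

Open source: VOICEVOX is built on the OSS (open-source software) version of VOICEVOX. For the differences between the product version and the OSS version, and for the module structure, see the overall architecture of VOICEVOX. The software part is Electron + Vue, and the speech synthesis engine part is Python + FastAPI. If there are features you would like to add or improve, please join the development.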

moro-tyo 2023/01/06

Muse: Text-To-Image Generation via Masked Generative Transformers

Muse: Text-To-Image Generation via Masked Generative Transformers Huiwen Chang*, Han Zhang*, Jarred Barber†, AJ Maschinot†, José Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein†, Yuanzhen Li†, Dilip Krishnan† *Equal contribution. †Core contribution. We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance

moro-tyo 2023/01/06

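Simulate Your Required Coverage and Appropriate Insurance Premiums | An Information Site from a Neutral Standpoint

Coverage you want to know about. Life insurance: calculates the required coverage amounts for term life insurance and income protection insurance together, to support your family's living expenses if you die. Medical insurance: calculates the required coverage amount for insurance that covers lost income and treatment costs if you are hospitalized. Inability to work: calculates the required coverage amount for insurance in case illness or injury leaves you unable to work. Enter calculation conditions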

moro-tyo 2023/01/03

moro-tyo's Bookmarks (2,863)
