StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Yupeng Zhou1* Daquan Zhou2‡† Mingming Cheng1 Jiashi Feng2 Qibin Hou1‡†
We propose EMO, an expressive audio-driven portrait-video generation framework. Given a single reference image and vocal audio, e.g., talking or singing, our method can generate vocal avatar videos with expressive facial expressions and various head poses; moreover, we can generate videos of any duration depending on the length of the input audio. Overview of the proposed method.