Seung Hyun Lee1, Sieun Kim1, Jaehwan Jeong1, Innfarn Yoo2, Feng Yang2, Donghyeon Cho1, Youngseo Kim1, Huiwen Chang2, Jinkyu Kim3,*, Sangpil Kim1,*

We propose a method for adding sound-guided visual effects to specific regions of videos in a zero-shot setting. Animating the appearance of the visual effect is challenging because each frame of the edited video should have visual changes while mai