Pre-recorded Sessions: From 4 December 2020 | Live Sessions: 10 – 13 December 2020


#SIGGRAPHAsia | #SIGGRAPHAsia2020

Technical Papers

  • Ultimate Supporter
  • Ultimate Attendee

Date/Time: 04 – 13 December 2020
All presentations are available in the virtual platform on-demand.


Lecturer(s):
Yang Zhou, University of Massachusetts Amherst, United States of America
Xintong Han, Huya Inc, China
Eli Shechtman, Adobe Research, United States of America
Jose Echevarria, Adobe Research, United States of America
Evangelos Kalogerakis, University of Massachusetts Amherst, United States of America
Dingzeyu Li, Adobe Research, United States of America


Description: We present a method that generates expressive talking-head videos from a single facial image with audio as the only input. In contrast to previous attempts to learn direct mappings from audio to raw pixels for creating talking faces, our method first disentangles the content and speaker information in the input audio signal. The audio content robustly controls the motion of the lips and nearby facial regions, while the speaker information determines the specifics of facial expressions and the rest of the talking-head dynamics. Another key component of our method is the prediction of facial landmarks reflecting the speaker-aware dynamics. Based on this intermediate representation, our method works with many portrait images in a single unified framework, including artistic paintings, sketches, 2D cartoon characters, Japanese manga, and stylized caricatures. In addition, our method generalizes well to faces and characters that were not observed during training. We present an extensive quantitative and qualitative evaluation of our method, in addition to user studies, demonstrating generated talking heads of significantly higher quality than prior state-of-the-art methods.
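
The staged pipeline described above (audio is disentangled into content and speaker embeddings, which drive per-frame facial landmark predictions that then animate the input portrait) can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: all module names, network choices, and dimensions are illustrative assumptions, and the final landmark-to-image rendering stage is omitted.

import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not taken from the paper):
AUDIO_DIM = 80        # e.g. mel-spectrogram bins per frame
CONTENT_DIM = 128     # speech-content embedding size
SPEAKER_DIM = 64      # speaker-identity embedding size
NUM_LANDMARKS = 68    # standard facial-landmark count

class ContentEncoder(nn.Module):
    """Maps per-frame audio features to content embeddings (drives lip motion)."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(AUDIO_DIM, CONTENT_DIM, batch_first=True)

    def forward(self, audio):                # audio: (B, T, AUDIO_DIM)
        out, _ = self.rnn(audio)
        return out                           # (B, T, CONTENT_DIM)

class SpeakerEncoder(nn.Module):
    """Summarizes the utterance into a speaker embedding (drives expression and head dynamics)."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(AUDIO_DIM, SPEAKER_DIM)

    def forward(self, audio):                # audio: (B, T, AUDIO_DIM)
        return self.proj(audio.mean(dim=1))  # (B, SPEAKER_DIM)

class LandmarkPredictor(nn.Module):
    """Predicts per-frame 2D landmark displacements from content + speaker embeddings."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(CONTENT_DIM + SPEAKER_DIM, 256), nn.ReLU(),
            nn.Linear(256, NUM_LANDMARKS * 2))

    def forward(self, content, speaker):     # content: (B, T, C), speaker: (B, S)
        T = content.size(1)
        spk = speaker.unsqueeze(1).expand(-1, T, -1)
        disp = self.mlp(torch.cat([content, spk], dim=-1))
        return disp.view(disp.size(0), T, NUM_LANDMARKS, 2)

# Dummy forward pass: 100 frames of audio features animating one portrait.
audio = torch.randn(1, 100, AUDIO_DIM)
base_landmarks = torch.randn(1, NUM_LANDMARKS, 2)   # landmarks detected on the input portrait
content = ContentEncoder()(audio)
speaker = SpeakerEncoder()(audio)
animated = base_landmarks.unsqueeze(1) + LandmarkPredictor()(content, speaker)
print(animated.shape)  # (1, 100, 68, 2): per-frame landmarks that would drive image warping/rendering

The point of the intermediate landmark representation is that the rendering stage only needs a portrait and a landmark sequence, which is why the same framework extends to paintings, sketches, cartoons, and caricatures rather than being tied to photographic faces.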

