
Synthesizing Obama: Learning Lip Sync from Audio

Supasorn Suwajanakorn (TH), Steven Seitz (US), Ira Kemelmacher-Shlizerman (IL)

Given audio of President Barack Obama, the scientists synthesize a high-quality video of him speaking with accurate lip sync, composited into a target video clip.

Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, the scientists synthesize high-quality mouth texture and composite it, with proper 3D pose matching, into a target video, so that what he appears to be saying matches the input audio track. The approach produces photorealistic results.
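
The audio-to-mouth-shape step can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain tanh recurrent cell with random, untrained weights as a stand-in for the trained network, and the function names and all dimensions (13 audio features per frame, 32 hidden units, 18 mouth-shape coefficients) are assumptions chosen only for illustration. It shows the general idea of a recurrent network that reads one audio feature vector per time step and emits one mouth-shape vector per time step.

```python
import numpy as np


def init_params(n_audio, n_hidden, n_mouth, seed=0):
    """Create random (untrained) weights for a simple tanh recurrent cell."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_xh": rng.normal(0.0, scale, (n_audio, n_hidden)),   # input -> hidden
        "W_hh": rng.normal(0.0, scale, (n_hidden, n_hidden)),  # hidden -> hidden
        "b_h": np.zeros(n_hidden),
        "W_hy": rng.normal(0.0, scale, (n_hidden, n_mouth)),   # hidden -> output
        "b_y": np.zeros(n_mouth),
    }


def audio_to_mouth_shapes(audio_features, params):
    """Map a (T, n_audio) sequence to a (T, n_mouth) sequence, one step at a time."""
    h = np.zeros(params["b_h"].shape[0])  # hidden state carries past audio context
    outputs = []
    for x_t in audio_features:
        h = np.tanh(x_t @ params["W_xh"] + h @ params["W_hh"] + params["b_h"])
        outputs.append(h @ params["W_hy"] + params["b_y"])
    return np.stack(outputs)


# Hypothetical sizes, for illustration only.
params = init_params(n_audio=13, n_hidden=32, n_mouth=18)
audio = np.random.default_rng(1).normal(size=(100, 13))  # 100 frames of fake features
mouth = audio_to_mouth_shapes(audio, params)
print(mouth.shape)  # (100, 18)
```

Because the hidden state is updated frame by frame, each predicted mouth shape depends on the current and earlier audio frames only. In a real system the weights would be learned from paired audio and video, and the predicted mouth shapes would then drive the texture synthesis and compositing steps described above.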


GRAIL Lab @ University of Washington