Dissertation: Face2Face - Facial Reenactment

In this dissertation we show our advances in the field of 3D reconstruction of human faces using commodity hardware. Beside the reconstruction of the facial geometry and texture, real-time face tracking is demonstrated. The developed algorithms are based on the principle of analysis-by-synthesis. To apply this principle, a mathematical model that represents a face virtually is defined. In addition to the face, the sensor observation process of the used camera is modeled. Utilizing this model to synthesize facial imagery, the model parameters are adjusted, such that the synthesized image fits the input image as good as possible. Thus, in reverse, this process transfers the input image to a virtual representation of the face. The achieved quality allows many new applications that require a good reconstruction of the face. One of these applications is the so-called ‘‘Facial Reenactment’’. Our developed methods show that such an application does not need any special hardware. The generated results are nearly photo-realistic videos that show the transfer of the mimic of one person to another person. These techniques can for example be used to bring movie dubbing to a new level. Instead of adapting the audio to the video, which might also include changes of the text, the video can be post-processed to match the mouth movements of the dubber. Since the approaches that we show in this dissertation run in real-time, one can also think of a live dubber in a video teleconferencing system that simultaneously translates the speech of a person to another language. The published videos of our projects in this dissertation led to a broad discussion in the media. On the one hand this is due to the fact that our methods are designed such that they run in real-time and on the other hand that we reduced the hardware requirements to a minimum. In fact, after some preprocessing, we are able to edit ordinary videos from the Internet in real-time. Amongst others, we impose a different mimic to faces of prominent persons like former presidents of the United States of America. This led inevitably to a discussion about trustworthiness of video material, especially from unknown source. Most people did not expect that such manipulations are possible, neglecting existing methods that are already able to edit videos (e.g. special effects in movie productions). Thus, beside the advances in real-time face tracking, our projects raised the awareness of video manipulation.

This dissertation was nominated by the University of Erlangen-Nuremberg for the ‘Dissertationspreis der Gesellschaft der Informatik 2017’. It is covered by the magazin it - Information Technology and published in ‘Ausgezeichnete Informatikdissertationen 2017’

[Paper] [Bibtex]