Enable 2D Lip Sync Wav2Lip Pipeline with OpenVINO Runtime


Authors: Xiake Sun, Kunda Xu

1. Introduction

Lip sync technologies are widely used in digital human applications, where they enhance the user experience in dialog scenarios.

Wav2Lip is an approach for generating accurate 2D lip-synced videos in the wild from just one video and one audio clip. It leverages an accurate lip-sync “expert” model and consecutive face frames to produce accurate, natural lip motion.

In this blog, we introduce how to enable and optimize the Wav2Lip pipeline with OpenVINO™.

Here is an overview of the Wav2Lip pipeline:

Figure 1: Wav2Lip pipeline overview

2. Setup Environment

$ git clone https://github.com/sammysun0711/openvino_aigc_samples.git
$ cd openvino_aigc_samples/Wav2Lip
$ conda create -n wav2lip python=3.8
$ conda activate wav2lip
$ pip install -r requirements.txt
$ sudo apt-get install ffmpeg

Download the Wav2Lip PyTorch model from link and move it to the checkpoints folder.
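
The setup above installs ffmpeg as a system dependency, so it can be worth confirming that the executable is reachable before going further. The following is a minimal sketch, assuming it is run inside the activated conda environment; the helper name find_tool is illustrative and not part of the sample repository:

```python
import shutil


def find_tool(name):
    """Return the absolute path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("ffmpeg")
    if path is None:
        print("ffmpeg not found on PATH; install it before running the pipeline")
    else:
        print(f"ffmpeg found at {path}")
```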

3. PyTorch to OpenVINO™ Model Conversion

$ python export_openvino.py

The exported OpenVINO™ model will be saved in the checkpoints folder.
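
An OpenVINO™ IR model consists of an .xml file describing the network topology and a .bin file holding the weights. A quick sanity check after export could look like the sketch below. The model names face_detection and wav2lip are taken from the paths used in the inference command of this blog, and the helper missing_ir_files is hypothetical rather than part of the repository:

```python
from pathlib import Path


def missing_ir_files(checkpoint_dir, model_names):
    """List the IR files (.xml topology, .bin weights) absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    missing = []
    for name in model_names:
        for suffix in (".xml", ".bin"):
            candidate = root / f"{name}{suffix}"
            if not candidate.is_file():
                missing.append(candidate.name)
    return missing


if __name__ == "__main__":
    # Model names assumed from the paths passed to the inference script.
    gaps = missing_ir_files("checkpoints", ["face_detection", "wav2lip"])
    print("All IR files present" if not gaps else f"Missing IR files: {gaps}")
```

If the list is non-empty, rerunning the export step is the first thing to try.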

4. Run Pipeline Inference with OpenVINO™ Runtime

$ python inference_ov.py --face_detection_path checkpoints/face_detection.xml --wav2lip_path checkpoints/wav2lip.xml --inference_device CPU --face data_video_sun_5s.mp4 --audio data_audio_sun_5s.wav

Here are the parameters with descriptions:

--face_detection_path: path to the face detection OpenVINO™ IR

--wav2lip_path: path to the Wav2Lip OpenVINO™ IR

--inference_device: device on which to run OpenVINO™ inference (for example, CPU or GPU)

--face: input video with face information

--audio: input audio with voice information

--static: set to True to run face detection on a single frame only, which speeds up inference
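
The options above map naturally onto Python's argparse module. The following is a hedged sketch of how such a parser might be declared. It is not the actual code of inference_ov.py: the defaults, the choice of required flags, and the way --static converts its value to a boolean are assumptions made for illustration.

```python
import argparse


def str_to_bool(value):
    """Interpret common textual spellings of a boolean flag value."""
    return str(value).strip().lower() in ("true", "1", "yes")


def build_parser():
    """Declare the command-line options described above (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Wav2Lip inference with OpenVINO Runtime (sketch)")
    parser.add_argument("--face_detection_path", required=True,
                        help="Path to the face detection IR (.xml)")
    parser.add_argument("--wav2lip_path", required=True,
                        help="Path to the Wav2Lip IR (.xml)")
    parser.add_argument("--inference_device", default="CPU",
                        help="Device name, e.g. CPU or GPU (default is assumed)")
    parser.add_argument("--face", required=True,
                        help="Input video containing the face")
    parser.add_argument("--audio", required=True,
                        help="Input audio containing the voice")
    parser.add_argument("--static", type=str_to_bool, default=False,
                        help="True to detect the face on a single frame only")
    return parser


if __name__ == "__main__":
    # Parse a sample argument list instead of sys.argv to keep the demo standalone.
    args = build_parser().parse_args([
        "--face_detection_path", "checkpoints/face_detection.xml",
        "--wav2lip_path", "checkpoints/wav2lip.xml",
        "--face", "data_video_sun_5s.mp4",
        "--audio", "data_audio_sun_5s.wav",
    ])
    print(args.inference_device, args.static)
```

Parsing --static through a small converter avoids the common pitfall where any non-empty string, including "False", would be treated as true.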

The generated video will be saved as results/result_voice.mp4.

Here is an example comparing the original video with the video generated by the Wav2Lip pipeline:

Figure 2: Original input video
Figure 3: Wav2Lip generated video

5. Conclusion

In this blog, we introduced how to deploy the Wav2Lip pipeline with OpenVINO™, covering the following steps:

  • Converting the PyTorch model to an OpenVINO™ model.
  • Running and optimizing the Wav2Lip pipeline with OpenVINO™ Runtime.