A frighteningly realistic video from a single photo: Microsoft introduces the VASA-1 neural network

by alex

Many examples have been published

Microsoft researchers have developed a new system, VASA-1, that can create realistic talking-face videos from a single image and an audio track.

VASA-1 can reproduce facial expressions, precisely synchronized lip movements, and natural head movements. The neural network captures a wide range of emotions and subtle nuances, which makes the generated faces more believable. Users can specify the character's gaze direction, perceived distance from the camera, and even the character's emotional state.

VASA-1 achieves this realism by handling facial features, 3D head position, and facial expressions as separate components. The researchers behind VASA-1 emphasize that the system is efficient enough to run in real time: it can generate video at a resolution of 512 x 512 pixels at 45 frames per second.

Many examples of the technology in action are available on the official website.
