Loading...

Microsoft’s VASA-1 can generate realistic talking faces from just one image

TL;DR

  • A research paper from Microsoft introduced a research project to generate talking heads.
  • The new AI model can generate a talking face or head by uploading a single photo and a voice note.
  • The animated face has realistic facial expressions and lip movements to match the voice with real life head movements.

In a recent white paper, Microsoft introduced a new AI model that produces a talking head that looks and sounds realistic and is generated by only uploading a still photograph and a voice sample.

The new model is named VASA-1, and it requires only one portrait style picture and an audio file of voice and fuses them together to make a short video of a talking head with facial expressions, lip syncing, and head movements. The produced head can even sing songs, and that in the voice uploaded at the time of creation.

Microsoft VASA-1 is a breakthrough for animation

According to Microsoft, the new AI model is still in the research phase, and there are still no plans to release it to the general public, and only Microsoft researchers have access to it. However, the company shared quite a few samples of the demonstrations, which show stunning realism and lip movements that seem to be too lifelike.

Source: Microsoft.

The demo shows people who look real, as if they were sitting in front of a camera and getting filmed. The movements of the heads are realistic and look quite natural, and the lip movement to match the audio is quite outstanding, provided that there seems very little to be noted anything for not being natural. The overall mouth synchronization is phenomenal.

Microsoft said the model was developed to animate virtual characters, and it claimed that all the people shown in the demo are synthetic, as they said, the models were generated from DALL-E, which is the image generator of OpenAI. So we think if it can animate an AI generated model, then obviously there is much more potential in it to animate photos of any real person, which should be more realistic and much easier for it to handle.

Use cases of Vasa-1 and its potential misuse

Source: Microsoft.

If we look at the potential of VASA-1 for practical usage, then on the baseline, it can be used to animate characters in animated movies, which will give characters a more realistic feel with natural facial expressions and head movements. Another use could be in video games, for the very same reason, think of Grand Theft Auto and the like. In the future it may be used for hyper realistic AI generated movies or series where characters can be generated from image generators and could be animated by VASA-1, and the audience may not even feel that the characters are not humans.

Along with creative use of the tool, it can also be leveraged to create content for malicious purposes. The potential misuse of VASA-1 could be its utilization for deepfakes, as it will make it easy for anyone involved in deepfake creations to scale up their bad tactics and generate more realistic misguiding content. Remember the robocall scandal involving Biden’s voice to refrain people from voting before a primary election? Now it could be a robovideo after the robocall, and that with very realistic human expressions.

The potential risk of misuse is may be the reason that Microsoft has limited its testing to its researchers only. According to Microsoft researchers, the tool can be used for creating misleading and deceivable content for impersonating humans, like some other tools, but they are aiming for positive use applications. Nvidia and Runway AI have also released their models for the same function, but VASA-1 seems far more realistic and a promising candidate. 

The research paper can be seen here, and Microsoft’s note here.

Disclaimer: The information provided is not trading advice. Cryptopolitan.com holds no liability for any investments made based on the information provided on this page. We strongly recommend independent research and/or consultation with a qualified professional before making any investment decision.

Share link:

Aamir Sheikh

Amir is a media, marketing and content professional working in the digital industry. A veteran in content production Amir is now an enthusiastic cryptocurrency proponent, analyst and writer.

Most read

Loading Most Read articles...

Stay on top of crypto news, get daily updates in your inbox

Related News

Tech
Cryptopolitan
Subscribe to CryptoPolitan