Microsoft takes on OpenAI’s Sora with a cutting-edge AI tool capable of turning a static image into a ‘Talking Tom’

What you need to know

Microsoft has unveiled VASA-1, a new AI tool that turns a static image and a speech audio clip into a short talking-face video.
The framework supports 512×512 videos at up to 40 FPS with negligible latency.
Microsoft is exploring different avenues to ensure the tool is used responsibly before releasing it to the general public.

Microsoft recently unveiled VASA — a new framework that generates “lifelike talking faces of virtual characters with appealing visual affective skills (VAS), given a single static image and a speech audio clip.”

VASA-1 transforms a static image into a short clip by producing lip movements that synchronize closely with a speech audio clip. According to Microsoft, the model makes the AI-generated result lifelike by “capturing a large spectrum of facial nuances and natural head motions that contribute to the perception of authenticity and liveliness.”

Will Microsoft’s VASA fuel widespread deepfakes?

With the rise of AI, deepfakes have become more common across social media platforms, along with AI-generated misinformation about elections. Now that a tool such as VASA-1 can produce high-quality video with lifelike facial and head dynamics from a static image, a major concern is how it will affect the credibility of news and information on the internet.

The tool supports 512×512 videos at up to 40 FPS with negligible latency. As it happens, I recently stumbled on a video on LinkedIn similar to Microsoft’s VASA-generated clips. The video was noticeably off in some respects, including the tone and the lip and head movements.

As more people embrace AI, tools like VASA and Image Creator from Designer will get better at generating images and clips. Such tools are already raising concerns among professionals in the built environment industry, who worry that AI’s ability to generate structural designs could render their roles obsolete.

We recently reported on a bizarre incident in which a popular Canadian rapper used AI to generate a verse in a deceased rapper’s voice, without his estate’s approval, and featured it in a track. As with the LinkedIn video, the flow on the diss track was off, but the deceased rapper’s voice was uncannily close to the real thing.

Microsoft says it has no plans to release “an online demo, API, product, additional implementation details, or any related offerings” until it has measures in place to ensure the tool is used responsibly.
