Google’s DeepMind ‘V2A’ AI technology can create soundtracks for videos based on both their pixels and your text … – MusicRadar

It's one thing to have AI that can create videos for you, but what if you want them to have sound, too? Google's DeepMind team now says that it's come up with video-to-audio (V2A) technology that can generate soundtracks - music, sound effects and speech - from both text prompts and a video's pixels.

This is the kind of news that might have soundtrack composers shuffling awkwardly in their seats - all the more so because, as well as working with automatic video generation services, V2A can also be applied to existing footage such as archive material and silent movies.

The text prompt aspect is interesting because, as well as entering positive prompts that guide the audio in the direction you want, you can also add negative prompts that tell the AI to avoid certain things. This means you can generate a potentially unlimited number of different soundtracks for any one piece of video.

This clip was generated using the prompt "A drummer on a stage at a concert surrounded by flashing lights and a cheering crowd".

The system is also capable of creating audio using just the video's pixels, so no text prompts are required if you don't want to use them.

Google DeepMind admits that V2A currently has some limitations - the quality of the audio is dependent on the quality of the video, and lip synchronisation when generating speech isn't perfect - but says that it's doing further research in a bid to address these.

Find out more and check out further examples on the Google DeepMind website.


