Mind Matters Natural and Artificial Intelligence News and Analysis

Meta’s Weird New Speech AI

Scammers have already capitalized on this kind of technology. Is it a mistake for Meta to push for it?

Meta has announced a new AI system called Voicebox, a text-to-speech model that can mimic the voices of loved ones. All it needs is a mere two seconds of authentic audio, and the bot will extrapolate whole sentences in that person’s voice. Meta noted that the technology will be helpful for people who are visually impaired and want to hear messages or texts read to them in a voice they know. A post on Meta’s blog reads,

Like generative systems for images and text, Voicebox creates outputs in a vast variety of styles, and it can create outputs from scratch as well as modify a sample it’s given. But instead of creating a picture or a passage of text, Voicebox produces high-quality audio clips. The model can synthesize speech across six languages, as well as perform noise removal, content editing, style conversion, and diverse sample generation.

Introducing Voicebox: The first generative AI model for speech to generalize across tasks with state-of-the-art performance (facebook.com)

But as Maggie Harrison sees it, the potential for abuse is evident. She writes,

The concept of replicating your bestie’s voice is still a bit unsettling, not to mention ripe for abuse. After all, if you can replicate a friend’s voice with just a two-second sound clip, you could practically replicate anyone’s voice as long as you had the audio. It’s a potential lapse in safety that could give way to phishing scams, misinformation, and even an audio version of deepfaked porn.

-Maggie Harrison, Facebook’s Creepy New AI Can Replicate Your Friends’ Voices (futurism.com)

We’ve already seen scary stories of the kinds of scams she’s talking about, in which criminals take genuine audio of someone from the internet and use it to terrify that person’s family members into making a ransom payment or something similar. Fortunately, Meta is not yet making the speech model available to the public, since it too is aware of the potential risks and abuses.

The larger context, of course, is this: Big Tech companies are racing to build AI into their products. The speed at which this is happening is outpacing any serious consideration of the ethical and practical concerns. Meta is pressing pause on public use for now, but rest assured that as long as there’s a competitive element in the race for AI kingship, bots like Voicebox will be in our future.

Mind Matters News

Breaking and noteworthy news from the exciting world of natural and artificial intelligence at MindMatters.ai.
