Can You Differentiate Between a Human Voice and an AI Voice?

Last Modified Date - May 22, 2020

You may be able to at the moment, but the way things are accelerating in the field of artificial intelligence (AI), that won’t last. In 2016, DeepMind, Google’s AI lab, developed a new type of synthetic speech system named WaveNet.

Since WaveNet was first developed, it has undergone various improvements and now sounds even closer to a human voice. One of its latest upgrades pairs it with a text-to-speech system called Tacotron 2.

The new system combines Tacotron 2’s deep neural networks with the efficient WaveNet. Tacotron 2 first translates text into a spectrogram, a chart that shows the relevant audio frequencies over time. That spectrogram is then fed into WaveNet, which generates the audio.
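To make the idea of a chart of audio frequencies more concrete, here is a minimal Python sketch that builds a plain spectrogram from a test tone. It is only an illustration under stated assumptions: the sample rate, frame size, tone, and function names are invented for this example, it uses a naive Fourier transform rather than the mel-scaled spectrogram Tacotron 2 works with, and it computes the chart from existing audio, whereas Tacotron 2 predicts it from text.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive discrete Fourier transform: strength of each frequency bin up to Nyquist."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(samples, frame_size):
    """Split the signal into non-overlapping frames and take each frame's spectrum."""
    return [
        magnitude_spectrum(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]

# All values below are assumptions chosen for this toy example.
SAMPLE_RATE = 8000  # samples per second
FRAME_SIZE = 64     # samples per time slice
TONE_HZ = 1000      # a pure tone standing in for speech

samples = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(256)]
spec = spectrogram(samples, FRAME_SIZE)

# Each row is one time slice; the strongest column marks the dominant frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(f"{len(spec)} time slices, dominant frequency {peak_hz:.0f} Hz")
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the kind of grid a system like WaveNet could take as input in place of raw text.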

The new system is so good that even the experts have a hard time differentiating between a human voice and the AI. In Google’s study, the AI achieved a mean opinion score (MOS) of 4.53, whereas a human achieved a MOS of 4.58. A MOS is the average of listeners’ ratings of how natural speech sounds, on a scale of 1 to 5. In layman’s terms, the AI is remarkably close to sounding like a real human. Even the company says “it sounds very much like a person speaking.”
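A mean opinion score is simply the average of listener ratings, usually given on a scale of 1 to 5. The short Python sketch below shows the arithmetic. The ratings in it are hypothetical, invented purely for illustration, and are not data from the study; the function name is also an assumption of this example.

```python
from statistics import mean

def mean_opinion_score(ratings):
    """Average listener ratings on a scale of 1 (bad) to 5 (excellent)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return mean(ratings)

# Hypothetical ratings, invented for illustration only (not from the study).
synthetic_ratings = [5, 4, 5, 4, 5, 4, 5, 4]
human_ratings = [5, 5, 4, 5, 4, 5, 4, 5]

synthetic_mos = mean_opinion_score(synthetic_ratings)
human_mos = mean_opinion_score(human_ratings)
gap = human_mos - synthetic_mos

print(f"synthetic MOS: {synthetic_mos}, human MOS: {human_mos}, gap: {gap}")
```

Applied to the reported figures, the same subtraction gives 4.58 − 4.53 = 0.05, a gap of five hundredths of a point on a five-point scale.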

As well as these synthetic voice systems, there are now AI systems that can generate images of humans that look real but aren’t. There’s even an AI that can create fake videos. While using AI to enhance art, music, or storytelling doesn’t sound too bad, the idea that AI can create a fake video of you is a scary thought.

WaveNet is hitting the headlines for a number of reasons. First, it pronounces words much more clearly than a lot of AI out there. It can also emphasize words where punctuation such as an exclamation mark calls for it. It still has its limitations, though. At the moment the system is trained on only one voice. If it is to work with another person’s voice, it will need to be trained all over again.

However, once the system has been perfected, it can be used in a number of applications, including Google Assistant. Certain roles now filled by people could even be replaced once the system works as the company hopes it will.

Source: Futurism