Baidu Develops New AI that Can Imitate your Voice by Listening for One Minute

Given the rapid pace of technological advancement, it is hard to ignore the mimicry that technology now makes possible. Researchers previously created a deep learning-based AI that can overlay one person’s face onto another individual’s body. Now they have done it again, this time with voices.

Researchers at Baidu, the Chinese Internet search giant, have developed an artificial intelligence (AI) system that they claim can learn to mimic your voice accurately after listening to it for less than a minute.

Leo Zou, a member of Baidu’s communications team, discussed the new AI in an interview with Digital Trends.

He said the technology marks a groundbreaking achievement: it shows that speech synthesis, a sophisticated generative modeling problem, can be adapted to new cases by learning efficiently from only a few examples.

Previously, such a model needed far more examples to learn a voice; today it needs only a fraction of that.

This is a clear testament to Baidu’s success in artificial intelligence research and, more precisely, in speech synthesis technology.

Despite Baidu’s success, the company is not the first to develop a voice-mimicking AI. In fact, we looked at a project dubbed Lyrebird last year, which relied on neural networks to mimic voices using a relatively limited number of samples.

That project succeeded in replicating the voices of the former and current presidents of the United States, Barack Obama and Donald Trump respectively.

Like Lyrebird’s, Baidu’s voice-replicating technology is not completely convincing.

However, it represents a remarkable improvement compared to many other robotic AI-powered voice assistants developed in the past.

Baidu’s new AI is built on the company’s text-to-speech system, Deep Voice. The system was trained on more than 800 hours of audio from 2,400 speakers.

To sound its best, the system requires only 100 five-second clips of voice training data. Even so, a version of Deep Voice trained on just ten five-second samples managed to trick a voice-recognition system more than 95% of the time.
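
To put those sample counts in perspective, the short sketch below works out how much audio each setting amounts to. It uses only the figures quoted above; the function name `total_audio_seconds` is hypothetical and chosen for illustration, not taken from Baidu's work.

```python
# Clip length quoted in the article (seconds per sample).
CLIP_SECONDS = 5


def total_audio_seconds(num_clips, clip_seconds=CLIP_SECONDS):
    """Return the total duration of the voice samples, in seconds."""
    return num_clips * clip_seconds


best_quality = total_audio_seconds(100)  # clips needed to sound its best
minimal = total_audio_seconds(10)        # clips that fooled voice recognition

print(f"100 clips: {best_quality} s (about {best_quality / 60:.1f} min)")
print(f"10 clips: {minimal} s (under a minute: {minimal < 60})")
```

In other words, the best-sounding setting needs a little over eight minutes of audio, while the ten-sample setting uses 50 seconds, which is consistent with the "less than a minute" claim.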

According to Leo Zou, Baidu sees powerful applications for its AI-powered voice-replicating technology. One significant potential use is assisting patients who have lost their voices.

He offered this example while describing the technology as a significant step toward customized human-machine interfaces. Leo Zou also said that the technology could let a mother easily configure an audiobook reader to use her own voice.

Baidu projects that the technology will enable the production of original digital content. For instance, many video game characters could be given distinctive voices thanks to this breakthrough.

Leo Zou added that the voice-replicating technology could prove useful in speech-to-speech translation, since the synthesizer can learn to imitate the speaker’s identity in a different language. If you want to delve deeper into this topic, try reading a paper that describes such work.

Source: Digital Trends

KC Cheung
KC Cheung has over 18 years of experience in the technology industry, including media, payments, and software, and has a keen interest in artificial intelligence, machine learning, deep learning, neural networks, and their applications in business. Over the years he has worked with some of the leading technology companies, building and growing dynamic teams in a fast-moving international environment.