Further developments in the many voices required to make global work. Can we learn user experience requirements this way?
Baidu’s new system can learn to imitate every accent
One AI, 10,000 different characters
By Ben Popper
At the start of this year, Chinese search giant Baidu introduced a new system called DeepVoice. It uses deep learning, a popular artificial intelligence technique, to build a system that can convert text to speech. The first version was able to produce short sentences that, at least on a cursory listen, were nearly indistinguishable from a real person. That system could learn one voice at a time, and required hours of data to master each one.
DeepVoice 2, which debuted in May, could imitate a voice with just half an hour of data, and a single system could learn hundreds of different accents. Today, Baidu is introducing the third and final version of DeepVoice; the company says this version can learn 10,000 voices with just half an hour of data each. Baidu says that “having a system that is able to effectively generate a wide variety of voices opens the door to many use cases that would otherwise not be feasible. For example, each character in an audio book or a video game would have his or her own unique voice for a more enhanced user experience.” ....
Wednesday, October 25, 2017