Baidu Research presents Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. The biggest obstacle to building such a system has so far been the speed of audio synthesis: previous approaches have taken minutes or hours to generate only a few seconds of speech. We solve this challenge and show that we can perform audio synthesis in real time, which amo
![Deep Voice: Real-Time Neural Text-to-Speech for Production - Baidu Research](https://cdn-ak-scissors.b.st-hatena.com/image/square/b309c33f090b27791072ea5fc552e05f83874648/height=288;version=1;width=512/http%3A%2F%2Fresearch.baidu.com%2Fwp-content%2Fuploads%2F2017%2F02%2FDeepVoice.jpg)