To understand why WaveNet improves on the current state of the art, it is useful to understand how text-to-speech (TTS) - or speech synthesis - systems work today. The majority of these are based on so-called concatenative TTS, which uses a large database of high-quality recordings, collected from a single voice actor over many hours. These recordings are split into tiny chunks that can then be co