
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech

Quantizer Reconstruction

Audio samples (columns): Original Speech | HiFi-GAN | Quantizer (1024 codes, 1 codebook) | Quantizer (65536 codes, 1 codebook) | Quantizer (160 codes, 4 codebooks) | Quantizer (160 codes, 4 codebooks)

Overall TTS Examples

Audio samples (columns): Reference Speaker | Transformer TTS | VITS (40M) | VITS (100M) | MQTTS (40M) | MQTTS (100M) | MQTTS (200M) | Text
in two thousand and eighteen there was so much um expectation on messi can he take them to the final and win it this time .
maybe don’t have enough room in front of the truck to ah get the truck out where you need it to to get you straight .
so same as with an apple watch series two or newer , you can take this ah with you when you’re swimming or even snorkeling , yes up to fifty meters .
yeah so ah all of the a i conferences are open to anyone who is capable of ah you know make you know paying for the trip and the the ticket .
but one that stands out to me was , ah , concluding the , ah , montreal protocol , ah , as a treaty that dealt with a extraordinarily significant environmental issue .
my guitar we just wanted to give a very timeless feeling with this ah album cover when i’m in my real life i’m basically always very unsophisticated .
i think it was very well planned very well executed and ah to solve those big bombs in this closed circle around and on the ship .

Extra (TODO, will add soon)

Observations