Time stamp/
#1, opened by anticope
Hello team, amazing solution.
Just wondering how it's possible to process or build the audio so that it matches the timestamps of the original text.
Let's say the text comes from an SRT or VTT subtitle file.
You can achieve this with purely signal-processing techniques applied to the audio, for example with the "speed" effect from SoX (https://sox.sourceforge.net/sox.html).
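As a rough illustration of the signal-processing route, here is a minimal Python sketch that reads the timing line of one subtitle cue, works out how much the synthesized clip has to be sped up or slowed down, and builds the matching sox command. The helper names and file names are hypothetical, and the sketch assumes timestamps in the usual `HH:MM:SS,mmm` (SRT) or `HH:MM:SS.mmm` (VTT) form. Note that "speed" changes pitch together with tempo; the sox "tempo" effect is the alternative if the pitch should stay unchanged.

```python
def timestamp_to_seconds(stamp: str) -> float:
    """Convert 'HH:MM:SS,mmm' (SRT) or 'HH:MM:SS.mmm' (VTT) to seconds."""
    hours, minutes, rest = stamp.strip().split(":")
    seconds, millis = rest.replace(".", ",").split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def cue_duration(timing_line: str) -> float:
    """Duration in seconds of a cue line such as '00:00:01,000 --> 00:00:03,500'."""
    start, end = timing_line.split("-->")
    # VTT may append cue settings after the end timestamp, so keep the first token only.
    end = end.split()[0]
    return timestamp_to_seconds(end) - timestamp_to_seconds(start)


def speed_factor(audio_seconds: float, target_seconds: float) -> float:
    """Factor for sox 'speed': greater than 1 shortens the audio, less than 1 lengthens it."""
    if target_seconds <= 0:
        raise ValueError("target duration must be positive")
    return audio_seconds / target_seconds


def sox_speed_command(src: str, dst: str, factor: float) -> list[str]:
    """Argument list for subprocess.run; the file names are placeholders."""
    return ["sox", src, dst, "speed", f"{factor:.4f}"]


if __name__ == "__main__":
    target = cue_duration("00:00:01,000 --> 00:00:03,500")
    factor = speed_factor(audio_seconds=3.0, target_seconds=target)
    print(" ".join(sox_speed_command("cue_0001.wav", "cue_0001_fit.wav", factor)))
```

Large factors will sound unnatural, so in practice it probably makes sense to clamp the factor to a modest range and accept a small timing mismatch for the remaining cues.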
In the case of synthesis, you have two additional options: 1) scale the durations used when synthesizing the audio; 2) condition duration prediction on some global information about the expected duration.
Both options fall outside typical TTS usage and are usually not supported by the default API.
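Option 1 can be sketched independently of any particular model. The example below assumes a hypothetical TTS pipeline that exposes per-phoneme durations as integer frame counts before the audio is generated; whether a given model allows this is not something the default API guarantees. It scales the durations uniformly so that their total matches a target frame count, handing the frames lost to rounding back to the phonemes with the largest fractional parts.

```python
def scale_durations(durations: list[int], target_total: int) -> list[int]:
    """Uniformly rescale per-phoneme frame counts so they sum to target_total.

    `durations` stands in for the output of a duration predictor; this is an
    assumption about the model, not a documented interface.
    """
    total = sum(durations)
    if total <= 0 or target_total < 0:
        raise ValueError("durations must sum to a positive value and target must be non-negative")

    ratio = target_total / total
    scaled = [d * ratio for d in durations]

    # Round down first, then hand out the leftover frames one at a time.
    rounded = [int(x) for x in scaled]
    leftover = target_total - sum(rounded)
    by_fraction = sorted(
        range(len(scaled)), key=lambda i: scaled[i] - rounded[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        rounded[i] += 1
    return rounded


if __name__ == "__main__":
    predicted = [4, 6, 10]  # made-up frame counts for three phonemes
    print(scale_durations(predicted, target_total=30))
```

To get the target frame count from a subtitle cue, divide the cue duration in seconds by the frame hop of the model in seconds; that hop value depends on the model and is not given in this thread.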
clementruhm changed discussion status to closed