Podlodka int8 version
Hi, Alexey!
Can you please make an "int8_float32" version of this Podlodka model? I just want to try it in my environment.
Also, what is your opinion on the Podlodka vs. Antony66 fine-tuned models?
Hi, Max.
Can you please explain what you mean by an int8_float32 version? I can pass the --quantization param to the ct2 converter, and it's either int8 or float32.
In my experience, Antony66's model simply performs better than plain Whisper. Podlodka's model, by contrast, generates different output with missing punctuation in some cases: it can improve WER, but it also introduces some new mistakes. So my personal opinion is to use Antony66's model.
Thanks for the comparison.
You can check the quantization options here: https://github.com/OpenNMT/CTranslate2/blob/master/docs/quantization.md
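If it helps, here is a minimal sketch of requesting int8_float32 directly through CTranslate2's Python converter API. The source model ID and output directory below are placeholders, not the actual repos:

```python
# Sketch: convert a Transformers Whisper checkpoint to CTranslate2 format
# with explicit int8_float32 quantization (int8 weights, float32 compute).
import ctranslate2

converter = ctranslate2.converters.TransformersConverter(
    "antony66/whisper-large-v3-russian"  # placeholder source model ID
)
converter.convert(
    "whisper-large-v3-russian-ct2",  # placeholder output directory
    quantization="int8_float32",     # one of the types listed in the docs above
    force=True,                      # overwrite the output directory if it exists
)
```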
For instance, converting Antony66's model with the plain "int8" option gives me the following warning when loading the model:
"[ctranslate2] [thread 158] [warning] The compute type inferred from the saved model is int8_bfloat16, but the target device or backend do not support efficient int8_bfloat16 computation. The model weights have been automatically converted to use the int8_float32 compute type instead."
It still loads and works OK.
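If you want to avoid that fallback entirely, you can also request the compute type explicitly at load time. A sketch, assuming you load the converted model through faster-whisper; the model path and audio file are placeholders:

```python
# Sketch: load the converted model with an explicit compute type so
# CTranslate2 doesn't have to fall back from an unsupported inferred type.
from faster_whisper import WhisperModel

model = WhisperModel(
    "whisper-large-v3-russian-ct2",  # placeholder path to the converted model
    device="cpu",
    compute_type="int8_float32",     # matches what the backend actually supports
)

segments, info = model.transcribe("audio.wav", language="ru")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```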
Thank you, now I see how it works here. The model conversion is on its way.
Thanks a lot! Will try it.
At first glance it works like a charm, thanks!