Whisper-WebUI

A Gradio-based browser interface for Whisper. You can use it as an Easy Subtitle Generator!

Notebook

If you wish to try this on Colab, you can do so here!

Generate subtitles from various sources, including:
- Files
- YouTube
- Microphone

Currently supported subtitle formats:
- SRT
- WebVTT

Speech-to-text translation:
- From other languages to English
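
The two supported subtitle formats differ mainly in their header, cue numbering, and the millisecond separator in timestamps. The following is a minimal sketch of that difference, not the project's actual writer; the function names `format_timestamp` and `write_subtitles` and the `(start, end, text)` segment layout are assumptions made for illustration.

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Format an offset in seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    if fmt not in ("srt", "vtt"):
        raise ValueError(f"unsupported format: {fmt}")
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    # SRT uses a comma before the milliseconds, WebVTT uses a period.
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{millis:03d}"


def write_subtitles(segments, fmt: str = "srt") -> str:
    """Render (start, end, text) tuples as SRT or WebVTT text.

    The segment layout is hypothetical; adapt it to whatever the
    transcription step actually returns.
    """
    # A WebVTT file must begin with the "WEBVTT" header line.
    lines = ["WEBVTT", ""] if fmt == "vtt" else []
    for index, (start, end, text) in enumerate(segments, start=1):
        if fmt == "srt":
            lines.append(str(index))  # SRT cues are numbered from 1
        lines.append(f"{format_timestamp(start, fmt)} --> {format_timestamp(end, fmt)}")
        lines.append(text.strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

Calling `write_subtitles([(0.0, 1.5, "Hello")], "srt")` yields a single numbered cue, while passing `"vtt"` yields the same cue under a `WEBVTT` header without the number.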

To run Whisper, you need Git, Python 3.8 to 3.10, and FFmpeg.

Please follow the links below to install the necessary software:

After installing FFmpeg, make sure to add the FFmpeg/bin folder to your system PATH!
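
If you want to confirm the prerequisites before installing, a check along these lines may help. It is a rough sketch that is not part of the project; `check_prerequisites` is a hypothetical helper, and it assumes the executables are named `git` and `ffmpeg` on your PATH.

```python
import shutil
import sys


def check_prerequisites(version=None, which=shutil.which):
    """Return a list of problems; an empty list means everything was found.

    `version` and `which` are injectable only to make the check easy to
    test; by default the running interpreter and the real PATH are used.
    """
    major_minor = tuple(version) if version is not None else tuple(sys.version_info[:2])
    problems = []
    # The README asks for Python 3.8 through 3.10.
    if not ((3, 8) <= major_minor <= (3, 10)):
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is outside the 3.8 to 3.10 range"
        )
    # shutil.which returns None when the executable is not on PATH.
    for tool in ("git", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

If `ffmpeg` is reported as missing even though it is installed, the FFmpeg/bin folder is most likely not on your PATH yet.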

If you have satisfied the prerequisites listed above, you are now ready to start Whisper-WebUI.

Run Install.bat from Windows Explorer as a regular, non-administrator user.
After installation, run start-webui.bat. (It automatically downloads the model if it is not already installed.)
Open your web browser and go to http://localhost:7860

(If you're running another Web-UI, it will be hosted on a different port, such as localhost:7861, localhost:7862, and so on.)
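
To see which port the WebUI is likely to end up on, you can probe ports upward from 7860 yourself. This is only an illustration of the fallback behavior described above, not Gradio's actual implementation; `port_is_free` and `first_free_port` are hypothetical helpers.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True


def first_free_port(start: int = 7860, attempts: int = 100, is_free=port_is_free) -> int:
    """Return the first port in [start, start + attempts) that is_free accepts."""
    # Valid TCP ports stop at 65535, so cap the search range.
    for port in range(start, min(start + attempts, 65536)):
        if is_free(port):
            return port
    raise RuntimeError(f"no free port found starting from {start}")


if __name__ == "__main__":
    print(f"Next free port: http://localhost:{first_free_port()}")
```

A port that is free when probed can still be taken a moment later, so treat the result as a hint rather than a guarantee.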

You can also run the project with command-line arguments by running user-start-webui.bat. See the wiki for a guide to the arguments.

The WebUI uses the OpenAI Whisper model.

Size	Parameters	English-only model	Multilingual model	Required VRAM	Relative speed
tiny	39 M	`tiny.en`	`tiny`	~1 GB	~32x
base	74 M	`base.en`	`base`	~1 GB	~16x
small	244 M	`small.en`	`small`	~2 GB	~6x
medium	769 M	`medium.en`	`medium`	~5 GB	~2x
large	1550 M	N/A	`large`	~10 GB	1x

The .en models are for English only. The cool thing is that you can use the Translate to English option with the "large" models!
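
One way to read the table is to pick the largest model whose VRAM requirement fits your GPU. The sketch below encodes the table's figures for that purpose; `pick_model` is a hypothetical helper, and because the VRAM numbers are approximate, its answer is a starting point rather than a guarantee.

```python
# (size, approximate required VRAM in GB, has an English-only ".en" variant)
# Values are copied from the table above and listed from smallest to largest.
MODELS = [
    ("tiny", 1, True),
    ("base", 1, True),
    ("small", 2, True),
    ("medium", 5, True),
    ("large", 10, False),  # "large" is multilingual only
]


def pick_model(vram_gb: float, english_only: bool = False) -> str:
    """Return the largest model name that fits in vram_gb of VRAM."""
    best = None
    for name, required_gb, has_en_variant in MODELS:
        if required_gb > vram_gb:
            continue  # does not fit in the available memory
        if english_only and not has_en_variant:
            continue  # no ".en" variant exists for this size
        best = name  # later entries are larger, so keep overwriting
    if best is None:
        raise ValueError(f"no model fits in {vram_gb} GB of VRAM")
    return f"{best}.en" if english_only else best
```

For example, `pick_model(6)` returns `medium`, and `pick_model(12, english_only=True)` returns `medium.en` because `large` has no English-only variant.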