# Features
* [x] 🔊 Text-to-audio
  * [x] 🗣 Text-to-speech
    * [x] 🐶 [Bark](https://github.com/suno-ai/bark)
      * [x] 🗣 Speech generation
      * [x] 🧬 Voice cloning
        * [x] 👍 Basic voice cloning
        * [x] 🧬 [Accurate voice cloning](https://github.com/gitmylo/bark-voice-cloning-HuBERT-quantizer)
      * [x] 🤣 Disable stopping token option to let the AI decide how it wants to continue
  * [x] 🎵 [AudioLDM](https://github.com/haoheliu/AudioLDM) text-to-audio generation
  * [x] 🎵 [AudioCraft](https://github.com/facebookresearch/audiocraft) text-to-audio generation
* [x] 🔊 Audio-to-audio
  * [x] 🐶 Bark audio-to-audio using [a custom quantizer](https://github.com/gitmylo/bark-voice-cloning-HuBERT-quantizer) to deconstruct audio for Bark input
  * [x] 😎 [RVC](https://github.com/RVC-Project/Retrieval-based-voice-conversion-webui) (retrieval-based voice conversion)
    * [x] 🧬 RVC training
    * [x] 🐸 [coqui-ai/TTS](https://github.com/coqui-ai/TTS) text-to-speech
* [x] 🎤 Automatic-speech-recognition
  * [x] 🎤 [Whisper](https://github.com/openai/whisper) speech recognition
* [x] 🚀 [Extensions](extensions/index.md)
  * [x] 🐍 Python
  * [x] 📜 JavaScript
  * [x] 🖌️ Styling