---
language:
- ja
license: apache-2.0
model_type: transformer
tags:
- text-to-speech
---

# Kotoba-Speech-v0.1

Kotoba-Speech v0.1 is a 1.2B-parameter Transformer-based generative speech model. It supports the following features:

1. Fluent text-to-speech generation in Japanese
2. One-shot voice cloning from a speech prompt

![logo](./logo.webp)

## Usage

Please check out our HF Spaces [demo](https://huggingface.co/spaces/kotoba-tech/Kotoba-Speech?logs=build).

## Model Details

* **Model type**: An end-to-end Transformer model.
* **Language(s)**: Japanese
* **Library**: We'll release our training code soon. The inference and model code are largely adapted from [metavoice](https://github.com/metavoiceio/metavoice-src).

## Acknowledgements

- We thank MetaVoice for open-sourcing their code.

## License

Apache License, Version 2.0, January 2004