FunAudioLLM/SenseVoiceSmall


游雁 committed on Jul 17

Commit 72963f9 • 1 Parent(s): 7470a36

update

Files changed (3)

README.md +5 -2
README_zh.md +5 -2
demo.py +27 -0

README.md CHANGED

@@ -11,8 +11,7 @@ SenseVoice is a speech foundation model with multiple speech understanding capab
 <div align="center">
 <h4>
-<a href="https://www.modelscope.cn/studios/iic/SenseVoice"> Online Demo </a>
-｜<a href="https://fun-audio-llm.github.io/"> Homepage </a>
 ｜<a href="#What's News"> What's News </a>
 ｜<a href="#Benchmarks"> Benchmarks </a>
 ｜<a href="#Install"> Install </a>
@@ -23,6 +22,10 @@ SenseVoice is a speech foundation model with multiple speech understanding capab
 Model Zoo:
 [modelscope](https://www.modelscope.cn/models/iic/SenseVoiceSmall), [huggingface](https://huggingface.co/FunAudioLLM/SenseVoiceSmall)
 </div>

 <div align="center">
 <h4>
+<a href="https://fun-audio-llm.github.io/"> Homepage </a>
 ｜<a href="#What's News"> What's News </a>
 ｜<a href="#Benchmarks"> Benchmarks </a>
 ｜<a href="#Install"> Install </a>
 Model Zoo:
 [modelscope](https://www.modelscope.cn/models/iic/SenseVoiceSmall), [huggingface](https://huggingface.co/FunAudioLLM/SenseVoiceSmall)
+Online Demo:
+[modelscope demo](https://www.modelscope.cn/studios/iic/SenseVoice), [huggingface space](https://huggingface.co/spaces/FunAudioLLM/SenseVoice)
 </div>

README_zh.md CHANGED

@@ -10,8 +10,7 @@ SenseVoice is an audio foundation model with audio understanding capabilities, including speech recognition
 [//]: # (<div align="center"><img src="image/sensevoice2.png" width="700"/> </div>)
 <h4>
-<a href="https://www.modelscope.cn/studios/iic/SenseVoice"> Online Demo </a>
-｜<a href="#What's New"> Documentation Home </a>
 ｜<a href="#核心功能"> Core Features </a>
 </h4>
 <h4>
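@@ -23,6 +22,10 @@ SenseVoice is an audio foundation model with audio understanding capabilities, including speech recognition
 </h4>
 Model Zoo: [modelscope](https://www.modelscope.cn/models/iic/SenseVoiceSmall) is recommended for users in mainland China, [huggingface](https://huggingface.co/FunAudioLLM/SenseVoiceSmall) for users overseas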
 </div>
 <a name="核心功能"></a>

 [//]: # (<div align="center"><img src="image/sensevoice2.png" width="700"/> </div>)
 <h4>
+<a href="#What's New"> Documentation Home </a>
 ｜<a href="#核心功能"> Core Features </a>
 </h4>
 <h4>
 </h4>
 Model Zoo: [modelscope](https://www.modelscope.cn/models/iic/SenseVoiceSmall) is recommended for users in mainland China, [huggingface](https://huggingface.co/FunAudioLLM/SenseVoiceSmall) for users overseas
+Online Demo:
+[modelscope demo](https://www.modelscope.cn/studios/iic/SenseVoice), [huggingface space](https://huggingface.co/spaces/FunAudioLLM/SenseVoice)
 </div>
 <a name="核心功能"></a>

demo.py ADDED

@@ -0,0 +1,27 @@

+from funasr import AutoModel
+from funasr.utils.postprocess_utils import rich_transcription_postprocess
+model_dir = "FunAudioLLM/SenseVoiceSmall"
+model = AutoModel(
+    model=model_dir,
+    vad_model="fsmn-vad",
+    vad_kwargs={"max_single_segment_time": 30000},
+    device="cuda:0",
+    hub="hf",
+)
+# en
+res = model.generate(
+    input=f"{model.model_path}/example/en.mp3",
+    cache={},
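+    language="auto",  # or one of "zh", "en", "yue", "ja", "ko", "nospeech"
+    use_itn=True,  # include punctuation and inverse text normalization in the output
+    batch_size_s=60,  # dynamic batching: total audio duration per batch, in seconds
+    merge_vad=True,  # merge short segments produced by the VAD model
+    merge_length_s=15,  # target length of merged segments, in seconds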
+)
+text = rich_transcription_postprocess(res[0]["text"])
+print(text)
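
The last two lines of demo.py pass the raw result through `rich_transcription_postprocess` before printing it. As a rough illustration of why such a step is useful, the sketch below assumes that the raw text in `res[0]["text"]` carries special tokens written as `<|...|>` (for example language, emotion, event, and ITN markers) in front of the transcript. The helper `strip_rich_tags`, the name `TAG_PATTERN`, and the sample string are hypothetical and written by hand for this note. This is not funasr's implementation, which may do more than remove the tokens.

```python
import re

# Assumption: raw SenseVoice output marks metadata with tokens of the
# form <|...|> (language, emotion, event, ITN flag) around the transcript.
TAG_PATTERN = re.compile(r"<\|[^|]*\|>")


def strip_rich_tags(raw_text: str) -> str:
    """Hypothetical helper: drop every <|...|> token and tidy whitespace."""
    without_tags = TAG_PATTERN.sub("", raw_text)
    # Collapse any runs of whitespace left behind by the removed tokens.
    return " ".join(without_tags.split())


# Hand-written sample string, not real model output.
sample = "<|en|><|NEUTRAL|><|Speech|><|withitn|>Hello world."
print(strip_rich_tags(sample))
```

For real use, prefer `rich_transcription_postprocess` as in demo.py; the sketch only shows the general shape of the cleanup.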