How can Whisper return the detected language?
In the Long-Form Transcription example, pipe does not return the detected language. How can I get it to return the language?
import torch
from transformers import pipeline
from datasets import load_dataset
device = "cuda:0" if torch.cuda.is_available() else "cpu"
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    chunk_length_s=30,
    device=device,
)
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]
prediction = pipe(sample.copy(), batch_size=8)
for k in prediction:
    print(k)
---- output ----
text
You can pass return_language=True when calling the pipeline to get the language detected for each chunk:
import torch
from transformers import pipeline
from datasets import load_dataset
device = "cuda:0" if torch.cuda.is_available() else "cpu"
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    chunk_length_s=30,
    device=device,
)
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]
prediction = pipe(sample), batch_size=8, return_language=True)
print(prediction)
Print Output:
{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
'chunks': [{'language': 'english',
'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.'}]}
The pipe(...) call above has a stray ")" after sample. It should be:
prediction = pipe(sample, batch_size=8, return_language=True)
(minor syntax error)
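Note that in the printed output above the language appears inside each entry of 'chunks', not at the top level of the result. Below is a minimal sketch of reading it back out, assuming the result has exactly the shape shown in that output. The helper names detected_languages and dominant_language are hypothetical, not part of transformers, and the prediction dict is copied from the printed output rather than produced by running the model.

```python
from collections import Counter


def detected_languages(prediction):
    """Return the language reported for each chunk, in order.

    Assumes the dict shape printed above: a 'chunks' list whose
    entries may carry a 'language' key.
    """
    return [chunk.get("language") for chunk in prediction.get("chunks", [])]


def dominant_language(prediction):
    """Return the most frequent chunk language, or None if none is reported."""
    langs = [lang for lang in detected_languages(prediction) if lang is not None]
    if not langs:
        return None
    return Counter(langs).most_common(1)[0][0]


# Result copied from the printed output above (not a fresh model run).
prediction = {
    "text": " Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.",
    "chunks": [
        {
            "language": "english",
            "text": " Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.",
        }
    ],
}

print(detected_languages(prediction))  # ['english']
print(dominant_language(prediction))   # english
```

For long audio split into several chunks, the per-chunk list may contain more than one language, which is why the sketch also reports the most frequent one.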