Customizable Phone and Word Delimiters
#7 opened by vishal99rkv
Hi!
Is there a way to customize the phone and word delimiters in this model's output? I tried using Wav2Vec2PhonemeCTCTokenizer with this model and changed the phone_delimiter_token and word_delimiter_token parameters, but it had no effect. Here is my code:
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC, Wav2Vec2PhonemeCTCTokenizer
from datasets import load_dataset
import torch
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-lv-60-espeak-cv-ft")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-lv-60-espeak-cv-ft")
# vocab.json is the unmodified file from this model's repository.
tokenizer = Wav2Vec2PhonemeCTCTokenizer("/content/vocab.json", phone_delimiter_token="|", word_delimiter_token="-")
ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
input_values = processor(ds[0]["audio"]["array"], return_tensors="pt").input_values
with torch.no_grad():
    logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = tokenizer.batch_decode(predicted_ids)
print(transcription)
This is the only output I get: ['ɐ m æ n s ɛ d t ə ð ə j uː n ɪ v ɚ s s ɚ aɪ ɛ ɡ z ɪ s t']
As you can see, the delimiters haven't changed.
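The only stopgap I can think of is to re-join the phones myself in plain Python. This is a rough sketch, assuming the decoded string always separates phones with whitespace, as in the output above. rejoin_phones is a hypothetical helper of my own, not part of transformers, and it cannot restore word boundaries because the decoded string carries none.

```python
def rejoin_phones(transcription: str, phone_delimiter: str = "|") -> str:
    """Re-join whitespace-separated phones with a custom delimiter.

    Hypothetical helper, not part of transformers. Assumes every phone in
    the decoded string is separated from the next by whitespace.
    """
    phones = transcription.split()  # split on any run of whitespace
    return phone_delimiter.join(phones)


# A shortened version of the decoded output from the post above.
decoded = ["ɐ m æ n s ɛ d"]
print([rejoin_phones(t) for t in decoded])
```

This only rewrites the string after decoding, so it doesn't explain why the tokenizer arguments are ignored.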
Any help is appreciated!