Challenges using Hugging Face example, line "speaker_embeddings = np.load("xvector_speaker_embedding.npy")"

#1
by hlia - opened

Hi there,

I'm trying to use the example code for speecht5_vc on the model card. I'm running into issues when I get to the line:
speaker_embeddings = np.load("xvector_speaker_embedding.npy")

where I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'xvector_speaker_embedding.npy'

I'm not sure if this is because I have an incorrect numpy version or if there is another issue. Please let me know how I might get around this.

The speaker embeddings are not included in this repo, so xvector_speaker_embedding.npy is a placeholder name. You can find a more complete example in the blog post: http://hf.co/blog/speecht5
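To illustrate that the error comes from the missing file and not from the numpy version, here is a minimal sketch using only numpy. It assumes the x-vector is 512-dimensional (the size used by the CMU ARCTIC x-vectors). The helper `save_xvector` and the random placeholder vector are hypothetical, for illustration only; a random vector is not a real speaker embedding and will not produce a sensible voice.

```python
import os
import tempfile

import numpy as np

# Assumption: SpeechT5 expects one 512-dimensional x-vector per speaker.
XVECTOR_DIM = 512


def save_xvector(xvector, path):
    """Hypothetical helper: save a 1-D x-vector as float32 so np.load(path) works later."""
    xvector = np.asarray(xvector, dtype=np.float32)
    if xvector.shape != (XVECTOR_DIM,):
        raise ValueError(f"expected shape ({XVECTOR_DIM},), got {xvector.shape}")
    np.save(path, xvector)


# Work in a temporary directory so nothing is written next to your script.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "xvector_speaker_embedding.npy")

# Before the file exists, np.load fails with the same error as in the question.
try:
    np.load(path)
    missing_file_error = False
except FileNotFoundError:
    missing_file_error = True

# Stand-in values only: replace this with a real x-vector for actual use.
placeholder = np.random.default_rng(0).standard_normal(XVECTOR_DIM)
save_xvector(placeholder, path)

# Once the file exists, the line from the model card works as written.
speaker_embeddings = np.load(path)
print(missing_file_error, speaker_embeddings.shape, speaker_embeddings.dtype)
```

In practice you would pass a real x-vector to `save_xvector`, after which the model card example can load it with `np.load` unchanged.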

The blog post mentions that you can get the x-vector speaker embeddings from the dataset "Matthijs/cmu-arctic-xvectors":

import torch
from datasets import load_dataset
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

Then use it to generate speech in the new voice:
speech = model.generate_speech(inputs["input_values"], speaker_embeddings, vocoder=vocoder)

Does anyone know how to convert speech to an x-vector?
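One possible approach is to run a pretrained speaker-recognition model over your own recording. The sketch below assumes the `speechbrain` and `torch` packages are installed and uses the `speechbrain/spkrec-xvect-voxceleb` checkpoint, which as far as I know produces 512-dimensional x-vectors compatible with SpeechT5. The function name `create_speaker_embedding` is my own, and the input is assumed to be a mono waveform sampled at 16 kHz.

```python
import numpy as np


def create_speaker_embedding(waveform, device="cpu"):
    """Return an L2-normalized x-vector for a mono 16 kHz waveform (1-D array)."""
    # Imports are local so that defining the function does not require
    # the heavy dependencies to be installed.
    import torch

    try:
        # Import path in speechbrain 1.0 and later.
        from speechbrain.inference.speaker import EncoderClassifier
    except ImportError:
        # Import path in older speechbrain releases.
        from speechbrain.pretrained import EncoderClassifier

    # The checkpoint is downloaded on first use and reloaded on every call;
    # cache the classifier yourself if you embed many recordings.
    classifier = EncoderClassifier.from_hparams(
        source="speechbrain/spkrec-xvect-voxceleb",
        run_opts={"device": device},
    )

    # encode_batch expects a batch of waveforms with shape (batch, time).
    wavs = torch.tensor(np.asarray(waveform, dtype=np.float32)).unsqueeze(0)
    with torch.no_grad():
        embeddings = classifier.encode_batch(wavs)
        embeddings = torch.nn.functional.normalize(embeddings, dim=2)

    # Drop the batch dimensions and return a plain numpy vector.
    return embeddings.squeeze().cpu().numpy()
```

You could then save the result with np.save("xvector_speaker_embedding.npy", create_speaker_embedding(waveform)) and load it with the np.load line from the model card.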
