Running SFT with PEFT fails with an error in RotaryEmbedding
#34, opened by Andcircle
I am trying to run SFT with PEFT, following this gist: https://gist.github.com/pacman100/1731b41f7a90a87b457e8c5415ff1c14
RotaryEmbedding raises this error:
return (q * cos) + (rotate_half(q) * sin), (k * cos) + (rotate_half(k) * sin)
TypeError: unsupported operand type(s) for *: 'Tensor' and 'NoneType'
I changed this line https://huggingface.co/tiiuae/falcon-7b/blob/2f5c3cd4eace6be6c0f12981f377fb35e5bf6ee5/modelling_RW.py#L73 to:
if seq_len != self.seq_len_cached or self.cos_cached is None or self.sin_cached is None:
and then it works.
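For anyone hitting the same error, the effect of the extended check can be sketched without torch as below. This is a minimal stand-in and not the actual modelling_RW.py code: `RotaryCacheSketch`, its plain-list cache, and the default `base` value are hypothetical, and only the three-part condition is taken from the change above. It does not explain how the cache ends up as None; it only shows that the extended check recomputes the cache when that happens.

```python
import math


class RotaryCacheSketch:
    """Torch-free stand-in for a cos/sin cache (hypothetical names)."""

    def __init__(self, head_dim, base=10000):
        # One inverse frequency per pair of dimensions.
        self.inv_freq = [1.0 / (base ** (i / head_dim)) for i in range(0, head_dim, 2)]
        self.seq_len_cached = None
        self.cos_cached = None
        self.sin_cached = None

    def cos_sin(self, seq_len):
        # Extended guard: recompute if the length changed OR either cache is missing.
        stale = (
            seq_len != self.seq_len_cached
            or self.cos_cached is None
            or self.sin_cached is None
        )
        if stale:
            self.seq_len_cached = seq_len
            freqs = [[t * f for f in self.inv_freq] for t in range(seq_len)]
            # Repeat the frequencies along the last dimension.
            emb = [row + row for row in freqs]
            self.cos_cached = [[math.cos(x) for x in row] for row in emb]
            self.sin_cached = [[math.sin(x) for x in row] for row in emb]
        return self.cos_cached, self.sin_cached


rope = RotaryCacheSketch(head_dim=4)
cos, sin = rope.cos_sin(3)

# Simulate the reported state: the length is still cached, but the values are gone.
rope.cos_cached = None
rope.sin_cached = None
cos2, sin2 = rope.cos_sin(3)  # recomputed instead of returning None
```

With a guard that compares only `seq_len` against `self.seq_len_cached`, the second call would return `(None, None)`, which is consistent with the `'Tensor' and 'NoneType'` error above.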
But I can't find anywhere that self.sin_cached is set to None. Any hints? Thanks.
I am using a cluster with 4 A10G GPUs,
CUDA version 12.0
torch version 2.0.1-cu118
+1