Why did num_key_value_heads change from 16 to 8?

#1
by codys12 - opened

Was the previous version of the model even usable or was it bugged out?

After the config change I now get this error: RuntimeError: shape '[25, 1024, 8, 128]' is invalid for input of size 52428800
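
For what it's worth, here is a quick arithmetic check on the numbers in that error. This is only a sketch: I am assuming the four dimensions are [batch, seq_len, num_key_value_heads, head_dim], which is my reading of the message, not something confirmed from the modeling code. The variable names are my own.

```python
from math import prod

# Numbers copied from the RuntimeError message.
target_shape = (25, 1024, 8, 128)  # assumed layout: batch, seq_len, kv_heads, head_dim
input_numel = 52_428_800           # actual number of elements in the tensor

# Elements the requested view would need.
expected_numel = prod(target_shape)

# Work backwards from the real tensor size to the head count it implies.
batch, seq_len, cfg_kv_heads, head_dim = target_shape
features_per_token = input_numel // (batch * seq_len)
implied_kv_heads = features_per_token // head_dim

print(f"view expects {expected_numel} elements, tensor has {input_numel}")
print(f"config says {cfg_kv_heads} KV heads, tensor size implies {implied_kv_heads}")
```

Under that assumption the tensor is exactly twice as large as the view expects, which is what you would see if the key/value projection still produces 16 heads of width 128 while the config now says 8.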
