Error in readme? #6
opened by CHNtentes
"specifically, we prune model embedding size, number of attention heads, and MLP intermediate dimension"
However, the number of attention heads is 32 for both this model and Llama 3.1, and the number of KV heads matches as well.
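For reference, a minimal sketch to compare the relevant config fields side by side (assuming the `transformers` library; the pruned-model Hub ID below is a placeholder, not the actual repo name):

```python
from transformers import AutoConfig

# Placeholder Hub IDs -- substitute the actual pruned model and its Llama 3.1 parent.
MODELS = {
    "pruned model": "your-org/pruned-llama",       # hypothetical ID
    "Llama 3.1":    "meta-llama/Llama-3.1-8B",
}

for label, model_id in MODELS.items():
    cfg = AutoConfig.from_pretrained(model_id)
    print(f"{label}: hidden_size={cfg.hidden_size}, "
          f"num_attention_heads={cfg.num_attention_heads}, "
          f"num_key_value_heads={cfg.num_key_value_heads}, "
          f"intermediate_size={cfg.intermediate_size}")
```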
Good catch, fixed.
srvm changed discussion status to closed