distill bert?
#1, opened by Javierquin
Hello, I don't understand why the final model is distilled when a BERT model (dccuchile/bert-base-spanish-wwm-cased) is used as the student in the script.
Thanks in advance for the help.