The benchmark scores are completely contradictory and need to be verified.
#16 by JesusCrist - opened
The evaluation benchmarks are exactly the same, but the scores are completely different.