You responded to me last time, so I thought it might be worth another shot: how do you evaluate models on the benchmarks?