Evaluation set

#168
by jdjayakaran2410 - opened

Is there a publicly available evaluation set that I can use to verify the benchmark results Llama 3 was tested on? Specifically, one that covers the categories below?

Category Count
Coding 150
Mathematical reasoning 150
Asking for Advice 150
Brainstorming 150
Classification 150
Closed Question Answering 150
Creative Writing 150
Extraction 150
Inhabiting a Character/Persona 150
Open Question Answering 150
Rewriting 150
Summarization 150
