Adding Evaluation Results

#6
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -29,4 +29,17 @@ The final perplexity on the test set is `13.6`.
  archivePrefix={arXiv},
  primaryClass={cs.CL}
  }
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ai-forever__rugpt3large_based_on_gpt2)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 25.98 |
+ | ARC (25-shot) | 22.61 |
+ | HellaSwag (10-shot) | 32.84 |
+ | MMLU (5-shot) | 24.9 |
+ | TruthfulQA (0-shot) | 43.39 |
+ | Winogrande (5-shot) | 53.12 |
+ | GSM8K (5-shot) | 0.3 |
+ | DROP (3-shot) | 4.72 |
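
The per-task numbers behind this table live in the details dataset linked above. A minimal sketch for browsing it with the Hugging Face `datasets` library; the config and split layout of leaderboard details repos varies per run, so the sketch discovers config names at runtime rather than assuming them:

```python
# Sketch (not part of the PR): inspect the linked leaderboard details dataset.
from datasets import get_dataset_config_names, load_dataset

repo = "open-llm-leaderboard/details_ai-forever__rugpt3large_based_on_gpt2"

# Each benchmark run is exposed as its own config; list what is available
# instead of hard-coding a config name.
configs = get_dataset_config_names(repo)
print(configs)

# Load one config as a DatasetDict and inspect its splits and example rows.
details = load_dataset(repo, configs[0])
print(details)
```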