SarwarShafee committed
Commit 8a21c53
1 Parent(s): b922d97

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -119,7 +119,7 @@ We evaluated the models on the following datasets:
  #### Evaluation on English Benchmark datasets
  - **llama-3.2-3b** consistently outperforms **titulm-llama-3.2-3b-v2.0** across all English tasks. It achieves high scores, particularly in **MMLU**, **BoolQ**, and **Commonsense QA**, with a maximum score of 0.80 on **PIQA** in the 5-shot setting.
  - In contrast, **titulm-llama-3.2-3b-v2.0** underperforms on all English benchmarks, scoring much lower than the base model, especially in **Commonsense QA** and **PIQA**, with only minor improvements between 0-shot and 5-shot.
- - It was expected as the model trained only on Bangla datasets.
+ - It was expected as the model was trained only on Bangla datasets.

  | Model | Shots | MMLU | BoolQ | Commonsense QA | OpenBook QA | PIQA |
  |-------------------------------|---------|-------------|--------|----------------|-------------|-------|
 
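For context on the benchmark rows touched by this hunk: the English scores (MMLU, BoolQ, Commonsense QA, OpenBook QA, PIQA at 0-shot and 5-shot) come from a standard few-shot evaluation. The README does not name the tooling, so the sketch below is only a minimal reproduction outline assuming EleutherAI's lm-evaluation-harness; the Hugging Face repo id is a placeholder assumption, and task names can differ between harness versions.

```python
# Minimal sketch, assuming lm-evaluation-harness (pip install lm-eval) and a
# transformers-compatible checkpoint. MODEL_ID is an assumed placeholder, not
# confirmed by the README; task names may vary by harness version.
import lm_eval

MODEL_ID = "hishab/titulm-llama-3.2-3b-v2.0"  # assumed repo id; replace as needed

results = lm_eval.simple_evaluate(
    model="hf",                           # Hugging Face transformers backend
    model_args=f"pretrained={MODEL_ID}",
    tasks=["mmlu", "boolq", "commonsense_qa", "openbookqa", "piqa"],
    num_fewshot=5,                        # use 0 for the 0-shot rows, 5 for the 5-shot rows
    batch_size=8,
)

# Print per-task metrics corresponding to the columns in the README table.
for task, metrics in results["results"].items():
    print(task, metrics)
```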