stabilityai/stablelm-zephyr-3b

pvduy committed on Dec 1, 2023

Commit 3da1353 (1 parent: 18fd9a0)

Update README.md

Files changed (1)

README.md +28 -0

@@ -94,6 +94,34 @@ The dataset is comprised of a mixture of open datasets large-scale datasets avai
 | Claude 2 |  - |RLHF |8.06| 91.36|
 | GPT-4 |  -| RLHF |8.99| 95.28|
+## Other benchmarks
+- BigBench (average): 0.3526
+| Task                                                | Version | Metric                  | Value | Stderr |
+|-----------------------------------------------------|---------|-------------------------|-------|--------|
+| bigbench_causal_judgement                           | 0       | multiple_choice_grade   | 0.5316| 0.0363 |
+| bigbench_date_understanding                         | 0       | multiple_choice_grade   | 0.4363| 0.0259 |
+| bigbench_disambiguation_qa                          | 0       | multiple_choice_grade   | 0.3217| 0.0291 |
+| bigbench_dyck_languages                             | 0       | multiple_choice_grade   | 0.1450| 0.0111 |
+| bigbench_formal_fallacies_syllogisms_negation       | 0       | multiple_choice_grade   | 0.4982| 0.0042 |
+| bigbench_geometric_shapes                           | 0       | multiple_choice_grade   | 0.1086| 0.0164 |
+| bigbench_hyperbaton                                 | 0       | exact_str_match         | 0.0000| 0.0000 |
+| bigbench_logical_deduction_five_objects             | 0       | multiple_choice_grade   | 0.5232| 0.0022 |
+| bigbench_logical_deduction_seven_objects            | 0       | multiple_choice_grade   | 0.2480| 0.0193 |
+| bigbench_logical_deduction_three_objects            | 0       | multiple_choice_grade   | 0.1814| 0.0146 |
+| bigbench_movie_recommendation                       | 0       | multiple_choice_grade   | 0.4067| 0.0284 |
+| bigbench_navigate                                   | 0       | multiple_choice_grade   | 0.2580| 0.0196 |
+| bigbench_reasoning_about_colored_objects            | 0       | multiple_choice_grade   | 0.5990| 0.0155 |
+| bigbench_ruin_names                                 | 0       | multiple_choice_grade   | 0.4370| 0.0111 |
+| bigbench_salient_translation_error_detection        | 0       | multiple_choice_grade   | 0.3951| 0.0231 |
+| bigbench_snarks                                     | 0       | multiple_choice_grade   | 0.2265| 0.0133 |
+| bigbench_sports_understanding                       | 0       | multiple_choice_grade   | 0.6464| 0.0356 |
+| bigbench_temporal_sequences                         | 0       | multiple_choice_grade   | 0.5091| 0.0159 |
+| bigbench_tracking_shuffled_objects_five_objects     | 0       | multiple_choice_grade   | 0.2680| 0.0140 |
+| bigbench_tracking_shuffled_objects_seven_objects    | 0       | multiple_choice_grade   | 0.1856| 0.0110 |
+| bigbench_tracking_shuffled_objects_three_objects    | 0       | multiple_choice_grade   | 0.1269| 0.0080 |
 ### Training Infrastructure
 * **Hardware**: `Stable Zephyr 3B` was trained on the Stability AI cluster across 8 nodes, each with 8 A100 80GB GPUs.
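
The commit does not state how the 0.3526 BigBench headline is derived from the per-task table. The sketch below is a quick sanity check, assuming the headline is an unweighted mean of the `Value` column; under that assumption, the figure appears to match the mean over the 20 `multiple_choice_grade` rows, with the single `exact_str_match` row left out. `ROWS` and `mean_score` are illustrative names introduced here, not part of the repository or of any evaluation harness.

```python
# (task, metric, value) triples copied from the BigBench table in this diff.
ROWS = [
    ("bigbench_causal_judgement", "multiple_choice_grade", 0.5316),
    ("bigbench_date_understanding", "multiple_choice_grade", 0.4363),
    ("bigbench_disambiguation_qa", "multiple_choice_grade", 0.3217),
    ("bigbench_dyck_languages", "multiple_choice_grade", 0.1450),
    ("bigbench_formal_fallacies_syllogisms_negation", "multiple_choice_grade", 0.4982),
    ("bigbench_geometric_shapes", "multiple_choice_grade", 0.1086),
    ("bigbench_hyperbaton", "exact_str_match", 0.0000),
    ("bigbench_logical_deduction_five_objects", "multiple_choice_grade", 0.5232),
    ("bigbench_logical_deduction_seven_objects", "multiple_choice_grade", 0.2480),
    ("bigbench_logical_deduction_three_objects", "multiple_choice_grade", 0.1814),
    ("bigbench_movie_recommendation", "multiple_choice_grade", 0.4067),
    ("bigbench_navigate", "multiple_choice_grade", 0.2580),
    ("bigbench_reasoning_about_colored_objects", "multiple_choice_grade", 0.5990),
    ("bigbench_ruin_names", "multiple_choice_grade", 0.4370),
    ("bigbench_salient_translation_error_detection", "multiple_choice_grade", 0.3951),
    ("bigbench_snarks", "multiple_choice_grade", 0.2265),
    ("bigbench_sports_understanding", "multiple_choice_grade", 0.6464),
    ("bigbench_temporal_sequences", "multiple_choice_grade", 0.5091),
    ("bigbench_tracking_shuffled_objects_five_objects", "multiple_choice_grade", 0.2680),
    ("bigbench_tracking_shuffled_objects_seven_objects", "multiple_choice_grade", 0.1856),
    ("bigbench_tracking_shuffled_objects_three_objects", "multiple_choice_grade", 0.1269),
]


def mean_score(rows, metric=None):
    """Unweighted mean of the Value column, optionally restricted to one metric."""
    values = [value for _, m, value in rows if metric is None or m == metric]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Mean over every row, including the exact_str_match row.
    print(f"all rows:                   {mean_score(ROWS):.4f}")  # 0.3358
    # Mean over the multiple_choice_grade rows only.
    print(f"multiple_choice_grade only: {mean_score(ROWS, 'multiple_choice_grade'):.4f}")  # 0.3526
```

The check only confirms that the arithmetic is consistent with the table as committed; it says nothing about whether each value sits on the correct task row.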