Locutusque
committed on
Commit
•
8c900eb
1
Parent(s):
a644db9
Update README.md
README.md
CHANGED
@@ -72,7 +72,76 @@ tau-0.5B was trained on a diverse dataset that may contain biases and inaccuraci
 | - agieval_sat_math | 1|none | 0|acc |0.2227|± |0.0281|
 | | |none | 0|acc_norm|0.1682|± |0.0253|
 
-
+| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
+|---------------------------------------|-------|----------------|-----:|-----------|-----:|---|-----:|
+|truthfulqa | 2|none | 0|acc |0.3931|± |0.0143|
+|mmlu |N/A |none | 0|acc |0.3642|± |0.0040|
+| - humanities |N/A |none | 5|acc |0.3320|± |0.0068|
+| - formal_logic | 0|none | 5|acc |0.2619|± |0.0393|
+| - high_school_european_history | 0|none | 5|acc |0.4909|± |0.0390|
+| - high_school_us_history | 0|none | 5|acc |0.4167|± |0.0346|
+| - high_school_world_history | 0|none | 5|acc |0.4641|± |0.0325|
+| - international_law | 0|none | 5|acc |0.5537|± |0.0454|
+| - jurisprudence | 0|none | 5|acc |0.4167|± |0.0477|
+| - logical_fallacies | 0|none | 5|acc |0.2638|± |0.0346|
+| - moral_disputes | 0|none | 5|acc |0.3757|± |0.0261|
+| - moral_scenarios | 0|none | 5|acc |0.2402|± |0.0143|
+| - philosophy | 0|none | 5|acc |0.3794|± |0.0276|
+| - prehistory | 0|none | 5|acc |0.3426|± |0.0264|
+| - professional_law | 0|none | 5|acc |0.3103|± |0.0118|
+| - world_religions | 0|none | 5|acc |0.2807|± |0.0345|
+| - other |N/A |none | 5|acc |0.4071|± |0.0088|
+| - business_ethics | 0|none | 5|acc |0.4200|± |0.0496|
+| - clinical_knowledge | 0|none | 5|acc |0.4491|± |0.0306|
+| - college_medicine | 0|none | 5|acc |0.3873|± |0.0371|
+| - global_facts | 0|none | 5|acc |0.3600|± |0.0482|
+| - human_aging | 0|none | 5|acc |0.3498|± |0.0320|
+| - management | 0|none | 5|acc |0.4854|± |0.0495|
+| - marketing | 0|none | 5|acc |0.5470|± |0.0326|
+| - medical_genetics | 0|none | 5|acc |0.4000|± |0.0492|
+| - miscellaneous | 0|none | 5|acc |0.4291|± |0.0177|
+| - nutrition | 0|none | 5|acc |0.4183|± |0.0282|
+| - professional_accounting | 0|none | 5|acc |0.3582|± |0.0286|
+| - professional_medicine | 0|none | 5|acc |0.3015|± |0.0279|
+| - virology | 0|none | 5|acc |0.3494|± |0.0371|
+| - social_sciences |N/A |none | 5|acc |0.4075|± |0.0088|
+| - econometrics | 0|none | 5|acc |0.2719|± |0.0419|
+| - high_school_geography | 0|none | 5|acc |0.5000|± |0.0356|
+| - high_school_government_and_politics| 0|none | 5|acc |0.4611|± |0.0360|
+| - high_school_macroeconomics | 0|none | 5|acc |0.4051|± |0.0249|
+| - high_school_microeconomics | 0|none | 5|acc |0.3908|± |0.0317|
+| - high_school_psychology | 0|none | 5|acc |0.4239|± |0.0212|
+| - human_sexuality | 0|none | 5|acc |0.3893|± |0.0428|
+| - professional_psychology | 0|none | 5|acc |0.3399|± |0.0192|
+| - public_relations | 0|none | 5|acc |0.4455|± |0.0476|
+| - security_studies | 0|none | 5|acc |0.3510|± |0.0306|
+| - sociology | 0|none | 5|acc |0.5174|± |0.0353|
+| - us_foreign_policy | 0|none | 5|acc |0.5500|± |0.0500|
+| - stem |N/A |none | 5|acc |0.3276|± |0.0083|
+| - abstract_algebra | 0|none | 5|acc |0.3000|± |0.0461|
+| - anatomy | 0|none | 5|acc |0.2889|± |0.0392|
+| - astronomy | 0|none | 5|acc |0.3487|± |0.0388|
+| - college_biology | 0|none | 5|acc |0.3403|± |0.0396|
+| - college_chemistry | 0|none | 5|acc |0.2600|± |0.0441|
+| - college_computer_science | 0|none | 5|acc |0.3800|± |0.0488|
+| - college_mathematics | 0|none | 5|acc |0.3300|± |0.0473|
+| - college_physics | 0|none | 5|acc |0.2745|± |0.0444|
+| - computer_security | 0|none | 5|acc |0.4300|± |0.0498|
+| - conceptual_physics | 0|none | 5|acc |0.3447|± |0.0311|
+| - electrical_engineering | 0|none | 5|acc |0.3931|± |0.0407|
+| - elementary_mathematics | 0|none | 5|acc |0.3095|± |0.0238|
+| - high_school_biology | 0|none | 5|acc |0.4161|± |0.0280|
+| - high_school_chemistry | 0|none | 5|acc |0.2759|± |0.0314|
+| - high_school_computer_science | 0|none | 5|acc |0.3100|± |0.0465|
+| - high_school_mathematics | 0|none | 5|acc |0.3185|± |0.0284|
+| - high_school_physics | 0|none | 5|acc |0.2517|± |0.0354|
+| - high_school_statistics | 0|none | 5|acc |0.3009|± |0.0313|
+| - machine_learning | 0|none | 5|acc |0.3036|± |0.0436|
+|medqa_4options |Yaml |none | 5|acc |0.2687|± |0.0124|
+| | |none | 5|acc_norm |0.2687|± |0.0124|
+|logieval | 0|get-answer | 5|exact_match|0.3505|± |0.0120|
+|gsm8k_cot | 3|strict-match | 8|exact_match|0.0690|± |0.0070|
+| | |flexible-extract| 8|exact_match|0.1365|± |0.0095|
 
 ## Usage Rights
 tau-0.5B is released under the cc-by-sa-4.0 license, allowing for both non-commercial and commercial use. Users are required to attribute the model to its creators and share any derivative works under the same license.
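
The results table added in this commit reports each score as a Value with a Stderr. As a rough reading aid, the sketch below parses a few of those rows (copied verbatim from the diff) and turns each Value ± Stderr pair into an approximate 95% interval. The helper names `parse_rows` and `ci95` are hypothetical, and the normal approximation (Value ± 1.96 × Stderr) is an assumption of this sketch, not something the README states.

```python
# Sketch only: parse_rows and ci95 are illustrative helpers, not part of any library.
Z_95 = 1.96  # two-sided 95% quantile of the standard normal (assumed approximation)

# A few rows copied verbatim from the table added in this commit.
ROWS = """\
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|---------------------------------------|-------|----------------|-----:|-----------|-----:|---|-----:|
|truthfulqa | 2|none | 0|acc |0.3931|± |0.0143|
|mmlu |N/A |none | 0|acc |0.3642|± |0.0040|
|gsm8k_cot | 3|strict-match | 8|exact_match|0.0690|± |0.0070|
| | |flexible-extract| 8|exact_match|0.1365|± |0.0095|
"""


def parse_rows(text):
    """Parse pipe-delimited result rows into dicts, skipping header and separator rows."""
    rows, last_task = [], None
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 8:
            continue
        task, _version, filt, n_shot, metric, value, _pm, stderr = cells
        try:
            n_shot, value, stderr = int(n_shot), float(value), float(stderr)
        except ValueError:
            continue  # header or separator row: no numeric cells
        # An empty task cell continues the previous task (e.g. a second filter).
        task = task.lstrip("- ") or last_task
        last_task = task
        rows.append({"task": task, "filter": filt, "n_shot": n_shot,
                     "metric": metric, "value": value, "stderr": stderr})
    return rows


def ci95(row):
    """Return the (low, high) normal-approximation 95% interval for one row."""
    half_width = Z_95 * row["stderr"]
    return row["value"] - half_width, row["value"] + half_width


if __name__ == "__main__":
    for row in parse_rows(ROWS):
        low, high = ci95(row)
        print(f'{row["task"]:<12} {row["filter"]:<16} {row["metric"]:<12} '
              f'{row["value"]:.4f}  [{low:.4f}, {high:.4f}]')
```

Read this way, subtasks with a Stderr around 0.04 to 0.05 have intervals nearly ±0.1 wide, so differences between individual MMLU subtasks in the table should be interpreted cautiously.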