---
license: gemma
base_model: google/gemma-2-2b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-2b_hs2_accumulate_iter20_sftsd2
  results: []
---

# collapse_gemma-2-2b_hs2_accumulate_iter20_sftsd2

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.1047
- Num Input Tokens Seen: 103558064

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.3909 | 0 |
| 1.6145 | 0.0026 | 5 | 1.3908 | 258392 |
| 1.5296 | 0.0052 | 10 | 1.3846 | 527784 |
| 1.4992 | 0.0078 | 15 | 1.3659 | 798352 |
| 1.5865 | 0.0104 | 20 | 1.3370 | 1073848 |
| 1.5308 | 0.0130 | 25 | 1.3015 | 1339176 |
| 1.3953 | 0.0156 | 30 | 1.2603 | 1603432 |
| 1.2499 | 0.0182 | 35 | 1.2346 | 1872504 |
| 1.2136 | 0.0208 | 40 | 1.2084 | 2133312 |
| 1.0826 | 0.0234 | 45 | 1.1999 | 2397488 |
| 1.0162 | 0.0260 | 50 | 1.2194 | 2664096 |
| 0.9383 | 0.0286 | 55 | 1.2402 | 2934000 |
| 0.9131 | 0.0312 | 60 | 1.2513 | 3203552 |
| 0.7416 | 0.0338 | 65 | 1.3015 | 3475136 |
| 0.5884 | 0.0363 | 70 | 1.3216 | 3747808 |
| 0.7072 | 0.0389 | 75 | 1.2972 | 4013536 |
| 0.4812 | 0.0415 | 80 | 1.3226 | 4287704 |
| 0.3795 | 0.0441 | 85 | 1.2909 | 4550600 |
| 0.4161 | 0.0467 | 90 | 1.2926 | 4821912 |
| 0.3331 | 0.0493 | 95 | 1.2767 | 5088800 |
| 0.2999 | 0.0519 | 100 | 1.2468 | 5357944 |
| 0.2594 | 0.0545 | 105 | 1.2432 | 5628784 |
| 0.3291 | 0.0571 | 110 | 1.2562 | 5892728 |
| 0.4045 | 0.0597 | 115 | 1.2429 | 6159248 |
| 0.1907 | 0.0623 | 120 | 1.2373 | 6430160 |
| 0.1466 | 0.0649 | 125 | 1.2259 | 6699432 |
| 0.2407 | 0.0675 | 130 | 1.2181 | 6964664 |
| 0.2293 | 0.0701 | 135 | 1.2135 | 7243576 |
| 0.29 | 0.0727 | 140 | 1.2130 | 7503336 |
| 0.244 | 0.0753 | 145 | 1.2093 | 7767024 |
| 0.2125 | 0.0779 | 150 | 1.2102 | 8040808 |
| 0.1996 | 0.0805 | 155 | 1.2102 | 8316752 |
| 0.2827 | 0.0831 | 160 | 1.2016 | 8591056 |
| 0.1894 | 0.0857 | 165 | 1.2031 | 8868216 |
| 0.2434 | 0.0883 | 170 | 1.1997 | 9141184 |
| 0.1646 | 0.0909 | 175 | 1.2151 | 9408984 |
| 0.1749 | 0.0935 | 180 | 1.2104 | 9681488 |
| 0.1963 | 0.0961 | 185 | 1.1989 | 9952128 |
| 0.1609 | 0.0987 | 190 | 1.2014 | 10220936 |
| 0.214 | 0.1013 | 195 | 1.1991 | 10486584 |
| 0.2428 | 0.1039 | 200 | 1.1955 | 10760136 |
| 0.1727 | 0.1065 | 205 | 1.2001 | 11029848 |
| 0.1375 | 0.1090 | 210 | 1.1985 | 11293992 |
| 0.172 | 0.1116 | 215 | 1.1977 | 11566128 |
| 0.152 | 0.1142 | 220 | 1.1974 | 11829832 |
| 0.1509 | 0.1168 | 225 | 1.1965 | 12094136 |
| 0.143 | 0.1194 | 230 | 1.1888 | 12361152 |
| 0.1545 | 0.1220 | 235 | 1.1928 | 12634720 |
| 0.2361 | 0.1246 | 240 | 1.1850 | 12899800 |
| 0.1479 | 0.1272 | 245 | 1.1892 | 13167992 |
| 0.1897 | 0.1298 | 250 | 1.2002 | 13433200 |
| 0.1605 | 0.1324 | 255 | 1.1893 | 13704744 |
| 0.1486 | 0.1350 | 260 | 1.1884 | 13977368 |
| 0.2541 | 0.1376 | 265 | 1.1949 | 14250448 |
| 0.2425 | 0.1402 | 270 | 1.1854 | 14524784 |
| 0.0945 | 0.1428 | 275 | 1.1952 | 14796744 |
| 0.1759 | 0.1454 | 280 | 1.1904 | 15067856 |
| 0.1667 | 0.1480 | 285 | 1.1790 | 15332272 |
| 0.1906 | 0.1506 | 290 | 1.1782 | 15598664 |
| 0.1118 | 0.1532 | 295 | 1.1812 | 15871872 |
| 0.1459 | 0.1558 | 300 | 1.1793 | 16146200 |
| 0.1598 | 0.1584 | 305 | 1.1858 | 16414936 |
| 0.1889 | 0.1610 | 310 | 1.1813 | 16679448 |
| 0.143 | 0.1636 | 315 | 1.1808 | 16946272 |
| 0.1267 | 0.1662 | 320 | 1.1762 | 17208384 |
| 0.0971 | 0.1688 | 325 | 1.1740 | 17480104 |
| 0.1202 | 0.1714 | 330 | 1.1750 | 17756088 |
| 0.1499 | 0.1740 | 335 | 1.1793 | 18017496 |
| 0.1817 | 0.1766 | 340 | 1.1807 | 18288864 |
| 0.1276 | 0.1792 | 345 | 1.1808 | 18568496 |
| 0.1054 | 0.1817 | 350 | 1.1788 | 18839152 |
| 0.196 | 0.1843 | 355 | 1.1790 | 19104776 |
| 0.0788 | 0.1869 | 360 | 1.1779 | 19374560 |
| 0.1614 | 0.1895 | 365 | 1.1753 | 19646424 |
| 0.147 | 0.1921 | 370 | 1.1740 | 19915536 |
| 0.1591 | 0.1947 | 375 | 1.1706 | 20185544 |
| 0.1701 | 0.1973 | 380 | 1.1639 | 20452728 |
| 0.1152 | 0.1999 | 385 | 1.1692 | 20724672 |
| 0.0937 | 0.2025 | 390 | 1.1666 | 20999368 |
| 0.1136 | 0.2051 | 395 | 1.1666 | 21270536 |
| 0.1473 | 0.2077 | 400 | 1.1683 | 21542456 |
| 0.0988 | 0.2103 | 405 | 1.1652 | 21808152 |
| 0.1653 | 0.2129 | 410 | 1.1648 | 22083264 |
| 0.1072 | 0.2155 | 415 | 1.1607 | 22352088 |
| 0.1493 | 0.2181 | 420 | 1.1650 | 22617352 |
| 0.1181 | 0.2207 | 425 | 1.1658 | 22888264 |
| 0.1462 | 0.2233 | 430 | 1.1659 | 23150728 |
| 0.1219 | 0.2259 | 435 | 1.1637 | 23417784 |
| 0.0878 | 0.2285 | 440 | 1.1627 | 23693912 |
| 0.0834 | 0.2311 | 445 | 1.1622 | 23969288 |
| 0.1681 | 0.2337 | 450 | 1.1640 | 24233528 |
| 0.1106 | 0.2363 | 455 | 1.1608 | 24504912 |
| 0.1374 | 0.2389 | 460 | 1.1629 | 24773872 |
| 0.1636 | 0.2415 | 465 | 1.1597 | 25035632 |
| 0.1747 | 0.2441 | 470 | 1.1595 | 25309520 |
| 0.1054 | 0.2467 | 475 | 1.1580 | 25576448 |
| 0.1926 | 0.2493 | 480 | 1.1572 | 25850856 |
| 0.1268 | 0.2518 | 485 | 1.1594 | 26114304 |
| 0.1415 | 0.2544 | 490 | 1.1579 | 26385096 |
| 0.105 | 0.2570 | 495 | 1.1556 | 26658432 |
| 0.1684 | 0.2596 | 500 | 1.1560 | 26924856 |
| 0.1644 | 0.2622 | 505 | 1.1575 | 27190288 |
| 0.145 | 0.2648 | 510 | 1.1595 | 27458400 |
| 0.1412 | 0.2674 | 515 | 1.1537 | 27732120 |
| 0.0909 | 0.2700 | 520 | 1.1511 | 28002768 |
| 0.1186 | 0.2726 | 525 | 1.1605 | 28268192 |
| 0.1592 | 0.2752 | 530 | 1.1553 | 28532200 |
| 0.1298 | 0.2778 | 535 | 1.1520 | 28800656 |
| 0.1457 | 0.2804 | 540 | 1.1565 | 29072288 |
| 0.1458 | 0.2830 | 545 | 1.1538 | 29334408 |
| 0.0919 | 0.2856 | 550 | 1.1509 | 29610928 |
| 0.1545 | 0.2882 | 555 | 1.1534 | 29879464 |
| 0.0711 | 0.2908 | 560 | 1.1530 | 30150784 |
| 0.0854 | 0.2934 | 565 | 1.1542 | 30419816 |
| 0.082 | 0.2960 | 570 | 1.1552 | 30681656 |
| 0.0986 | 0.2986 | 575 | 1.1519 | 30951848 |
| 0.1613 | 0.3012 | 580 | 1.1493 | 31217072 |
| 0.142 | 0.3038 | 585 | 1.1498 | 31479648 |
| 0.0784 | 0.3064 | 590 | 1.1522 | 31752600 |
| 0.0931 | 0.3090 | 595 | 1.1478 | 32017424 |
| 0.1137 | 0.3116 | 600 | 1.1490 | 32288192 |
| 0.2075 | 0.3142 | 605 | 1.1518 | 32549672 |
| 0.1036 | 0.3168 | 610 | 1.1491 | 32818992 |
| 0.0535 | 0.3194 | 615 | 1.1475 | 33084800 |
| 0.0943 | 0.3220 | 620 | 1.1507 | 33350136 |
| 0.0635 | 0.3245 | 625 | 1.1477 | 33621160 |
| 0.0634 | 0.3271 | 630 | 1.1443 | 33897880 |
| 0.0652 | 0.3297 | 635 | 1.1483 | 34161280 |
| 0.0787 | 0.3323 | 640 | 1.1505 | 34432176 |
| 0.1225 | 0.3349 | 645 | 1.1465 | 34696144 |
| 0.1316 | 0.3375 | 650 | 1.1426 | 34968016 |
| 0.0636 | 0.3401 | 655 | 1.1448 | 35240144 |
| 0.0588 | 0.3427 | 660 | 1.1488 | 35510360 |
| 0.1409 | 0.3453 | 665 | 1.1459 | 35776864 |
| 0.1758 | 0.3479 | 670 | 1.1415 | 36047768 |
| 0.1451 | 0.3505 | 675 | 1.1439 | 36316344 |
| 0.0978 | 0.3531 | 680 | 1.1468 | 36592496 |
| 0.092 | 0.3557 | 685 | 1.1477 | 36865440 |
| 0.136 | 0.3583 | 690 | 1.1439 | 37135784 |
| 0.0507 | 0.3609 | 695 | 1.1376 | 37414872 |
| 0.1824 | 0.3635 | 700 | 1.1425 | 37687760 |
| 0.0939 | 0.3661 | 705 | 1.1445 | 37957952 |
| 0.2033 | 0.3687 | 710 | 1.1433 | 38231504 |
| 0.1383 | 0.3713 | 715 | 1.1362 | 38504680 |
| 0.112 | 0.3739 | 720 | 1.1417 | 38782552 |
| 0.1266 | 0.3765 | 725 | 1.1416 | 39051264 |
| 0.1089 | 0.3791 | 730 | 1.1366 | 39323600 |
| 0.0977 | 0.3817 | 735 | 1.1399 | 39591920 |
| 0.1273 | 0.3843 | 740 | 1.1432 | 39862656 |
| 0.1299 | 0.3869 | 745 | 1.1388 | 40127760 |
| 0.1003 | 0.3895 | 750 | 1.1400 | 40392152 |
| 0.1187 | 0.3921 | 755 | 1.1421 | 40660312 |
| 0.1349 | 0.3947 | 760 | 1.1386 | 40934136 |
| 0.1324 | 0.3972 | 765 | 1.1387 | 41198536 |
| 0.0856 | 0.3998 | 770 | 1.1415 | 41466176 |
| 0.1223 | 0.4024 | 775 | 1.1398 | 41731536 |
| 0.1287 | 0.4050 | 780 | 1.1353 | 42004888 |
| 0.1427 | 0.4076 | 785 | 1.1350 | 42280112 |
| 0.1337 | 0.4102 | 790 | 1.1370 | 42541760 |
| 0.0883 | 0.4128 | 795 | 1.1386 | 42811744 |
| 0.0646 | 0.4154 | 800 | 1.1370 | 43085568 |
| 0.1595 | 0.4180 | 805 | 1.1373 | 43358080 |
| 0.1033 | 0.4206 | 810 | 1.1350 | 43628600 |
| 0.1057 | 0.4232 | 815 | 1.1327 | 43897312 |
| 0.1269 | 0.4258 | 820 | 1.1367 | 44172984 |
| 0.1877 | 0.4284 | 825 | 1.1335 | 44446264 |
| 0.135 | 0.4310 | 830 | 1.1334 | 44715128 |
| 0.1446 | 0.4336 | 835 | 1.1342 | 44979672 |
| 0.136 | 0.4362 | 840 | 1.1350 | 45252752 |
| 0.0793 | 0.4388 | 845 | 1.1334 | 45523544 |
| 0.1266 | 0.4414 | 850 | 1.1345 | 45787888 |
| 0.1508 | 0.4440 | 855 | 1.1331 | 46065944 |
| 0.0948 | 0.4466 | 860 | 1.1310 | 46335728 |
| 0.0679 | 0.4492 | 865 | 1.1335 | 46600096 |
| 0.1145 | 0.4518 | 870 | 1.1354 | 46868264 |
| 0.1181 | 0.4544 | 875 | 1.1330 | 47138224 |
| 0.1 | 0.4570 | 880 | 1.1341 | 47406480 |
| 0.1058 | 0.4596 | 885 | 1.1375 | 47673048 |
| 0.0788 | 0.4622 | 890 | 1.1360 | 47938144 |
| 0.1505 | 0.4648 | 895 | 1.1370 | 48197688 |
| 0.1455 | 0.4674 | 900 | 1.1322 | 48476528 |
| 0.1784 | 0.4699 | 905 | 1.1338 | 48745120 |
| 0.1141 | 0.4725 | 910 | 1.1341 | 49010696 |
| 0.1 | 0.4751 | 915 | 1.1307 | 49280080 |
| 0.0993 | 0.4777 | 920 | 1.1283 | 49553168 |
| 0.1491 | 0.4803 | 925 | 1.1318 | 49813160 |
| 0.0988 | 0.4829 | 930 | 1.1319 | 50081592 |
| 0.0779 | 0.4855 | 935 | 1.1281 | 50347024 |
| 0.0937 | 0.4881 | 940 | 1.1294 | 50612736 |
| 0.0894 | 0.4907 | 945 | 1.1306 | 50881480 |
| 0.1529 | 0.4933 | 950 | 1.1315 | 51151224 |
| 0.1524 | 0.4959 | 955 | 1.1305 | 51423752 |
| 0.1772 | 0.4985 | 960 | 1.1302 | 51688888 |
| 0.0745 | 0.5011 | 965 | 1.1303 | 51947968 |
| 0.092 | 0.5037 | 970 | 1.1286 | 52216872 |
| 0.1154 | 0.5063 | 975 | 1.1274 | 52481664 |
| 0.1414 | 0.5089 | 980 | 1.1279 | 52746376 |
| 0.1522 | 0.5115 | 985 | 1.1271 | 53014560 |
| 0.181 | 0.5141 | 990 | 1.1264 | 53281872 |
| 0.1158 | 0.5167 | 995 | 1.1256 | 53544912 |
| 0.1501 | 0.5193 | 1000 | 1.1311 | 53812728 |
| 0.1235 | 0.5219 | 1005 | 1.1308 | 54076192 |
| 0.1333 | 0.5245 | 1010 | 1.1258 | 54353264 |
| 0.1043 | 0.5271 | 1015 | 1.1266 | 54619912 |
| 0.0968 | 0.5297 | 1020 | 1.1280 | 54888160 |
| 0.1537 | 0.5323 | 1025 | 1.1264 | 55157704 |
| 0.1453 | 0.5349 | 1030 | 1.1245 | 55427712 |
| 0.1547 | 0.5375 | 1035 | 1.1262 | 55693776 |
| 0.1455 | 0.5400 | 1040 | 1.1281 | 55961112 |
| 0.1729 | 0.5426 | 1045 | 1.1269 | 56224368 |
| 0.1405 | 0.5452 | 1050 | 1.1255 | 56492104 |
| 0.1294 | 0.5478 | 1055 | 1.1263 | 56762016 |
| 0.1261 | 0.5504 | 1060 | 1.1257 | 57033768 |
| 0.1117 | 0.5530 | 1065 | 1.1230 | 57300568 |
| 0.0974 | 0.5556 | 1070 | 1.1243 | 57574264 |
| 0.1087 | 0.5582 | 1075 | 1.1261 | 57844592 |
| 0.1801 | 0.5608 | 1080 | 1.1239 | 58117224 |
| 0.1311 | 0.5634 | 1085 | 1.1224 | 58389000 |
| 0.1155 | 0.5660 | 1090 | 1.1247 | 58651312 |
| 0.181 | 0.5686 | 1095 | 1.1244 | 58921192 |
| 0.107 | 0.5712 | 1100 | 1.1230 | 59188400 |
| 0.0802 | 0.5738 | 1105 | 1.1224 | 59459240 |
| 0.1087 | 0.5764 | 1110 | 1.1239 | 59731216 |
| 0.1252 | 0.5790 | 1115 | 1.1257 | 59997104 |
| 0.098 | 0.5816 | 1120 | 1.1238 | 60259496 |
| 0.1461 | 0.5842 | 1125 | 1.1217 | 60530592 |
| 0.1546 | 0.5868 | 1130 | 1.1227 | 60800696 |
| 0.1437 | 0.5894 | 1135 | 1.1215 | 61064464 |
| 0.0833 | 0.5920 | 1140 | 1.1225 | 61333752 |
| 0.1788 | 0.5946 | 1145 | 1.1233 | 61605576 |
| 0.0966 | 0.5972 | 1150 | 1.1240 | 61874888 |
| 0.1225 | 0.5998 | 1155 | 1.1262 | 62149776 |
| 0.155 | 0.6024 | 1160 | 1.1232 | 62422320 |
| 0.103 | 0.6050 | 1165 | 1.1218 | 62688400 |
| 0.0873 | 0.6076 | 1170 | 1.1257 | 62951688 |
| 0.0598 | 0.6102 | 1175 | 1.1237 | 63229384 |
| 0.0938 | 0.6127 | 1180 | 1.1221 | 63502432 |
| 0.119 | 0.6153 | 1185 | 1.1232 | 63769976 |
| 0.094 | 0.6179 | 1190 | 1.1241 | 64041896 |
| 0.0959 | 0.6205 | 1195 | 1.1211 | 64313584 |
| 0.1195 | 0.6231 | 1200 | 1.1215 | 64573984 |
| 0.1594 | 0.6257 | 1205 | 1.1232 | 64844848 |
| 0.1303 | 0.6283 | 1210 | 1.1214 | 65111832 |
| 0.1715 | 0.6309 | 1215 | 1.1193 | 65382416 |
| 0.1249 | 0.6335 | 1220 | 1.1195 | 65652960 |
| 0.0765 | 0.6361 | 1225 | 1.1219 | 65919288 |
| 0.1613 | 0.6387 | 1230 | 1.1235 | 66189504 |
| 0.1343 | 0.6413 | 1235 | 1.1226 | 66457344 |
| 0.1503 | 0.6439 | 1240 | 1.1216 | 66722344 |
| 0.1352 | 0.6465 | 1245 | 1.1196 | 66988272 |
| 0.1121 | 0.6491 | 1250 | 1.1206 | 67264296 |
| 0.1189 | 0.6517 | 1255 | 1.1213 | 67537264 |
| 0.0498 | 0.6543 | 1260 | 1.1195 | 67803944 |
| 0.1003 | 0.6569 | 1265 | 1.1209 | 68075192 |
| 0.0707 | 0.6595 | 1270 | 1.1211 | 68342008 |
| 0.1244 | 0.6621 | 1275 | 1.1210 | 68617864 |
| 0.1136 | 0.6647 | 1280 | 1.1209 | 68881392 |
| 0.1425 | 0.6673 | 1285 | 1.1218 | 69153160 |
| 0.1514 | 0.6699 | 1290 | 1.1210 | 69424704 |
| 0.1413 | 0.6725 | 1295 | 1.1187 | 69694064 |
| 0.1359 | 0.6751 | 1300 | 1.1196 | 69953840 |
| 0.085 | 0.6777 | 1305 | 1.1203 | 70222800 |
| 0.1225 | 0.6803 | 1310 | 1.1228 | 70492904 |
| 0.0892 | 0.6829 | 1315 | 1.1196 | 70764608 |
| 0.1643 | 0.6854 | 1320 | 1.1185 | 71032944 |
| 0.1677 | 0.6880 | 1325 | 1.1197 | 71295504 |
| 0.0911 | 0.6906 | 1330 | 1.1186 | 71559376 |
| 0.0718 | 0.6932 | 1335 | 1.1184 | 71822072 |
| 0.0906 | 0.6958 | 1340 | 1.1176 | 72092416 |
| 0.209 | 0.6984 | 1345 | 1.1183 | 72352696 |
| 0.0958 | 0.7010 | 1350 | 1.1174 | 72621264 |
| 0.1336 | 0.7036 | 1355 | 1.1179 | 72889160 |
| 0.1134 | 0.7062 | 1360 | 1.1173 | 73167560 |
| 0.1962 | 0.7088 | 1365 | 1.1170 | 73430904 |
| 0.0822 | 0.7114 | 1370 | 1.1175 | 73704712 |
| 0.1833 | 0.7140 | 1375 | 1.1182 | 73977720 |
| 0.1294 | 0.7166 | 1380 | 1.1173 | 74246392 |
| 0.1399 | 0.7192 | 1385 | 1.1159 | 74518120 |
| 0.1332 | 0.7218 | 1390 | 1.1157 | 74785616 |
| 0.2321 | 0.7244 | 1395 | 1.1169 | 75053328 |
| 0.1325 | 0.7270 | 1400 | 1.1161 | 75325400 |
| 0.1606 | 0.7296 | 1405 | 1.1156 | 75593912 |
| 0.1603 | 0.7322 | 1410 | 1.1152 | 75853848 |
| 0.1493 | 0.7348 | 1415 | 1.1148 | 76122048 |
| 0.1003 | 0.7374 | 1420 | 1.1158 | 76387640 |
| 0.1172 | 0.7400 | 1425 | 1.1159 | 76649064 |
| 0.1409 | 0.7426 | 1430 | 1.1166 | 76914848 |
| 0.1365 | 0.7452 | 1435 | 1.1154 | 77183160 |
| 0.1136 | 0.7478 | 1440 | 1.1147 | 77456664 |
| 0.1053 | 0.7504 | 1445 | 1.1158 | 77721632 |
| 0.0607 | 0.7530 | 1450 | 1.1151 | 77991144 |
| 0.1031 | 0.7555 | 1455 | 1.1143 | 78261840 |
| 0.0968 | 0.7581 | 1460 | 1.1170 | 78532984 |
| 0.1099 | 0.7607 | 1465 | 1.1182 | 78807552 |
| 0.0997 | 0.7633 | 1470 | 1.1163 | 79073896 |
| 0.1312 | 0.7659 | 1475 | 1.1127 | 79342784 |
| 0.0813 | 0.7685 | 1480 | 1.1117 | 79602712 |
| 0.1151 | 0.7711 | 1485 | 1.1145 | 79876296 |
| 0.0893 | 0.7737 | 1490 | 1.1169 | 80149080 |
| 0.0678 | 0.7763 | 1495 | 1.1147 | 80420384 |
| 0.1377 | 0.7789 | 1500 | 1.1141 | 80687800 |
| 0.1472 | 0.7815 | 1505 | 1.1175 | 80957888 |
| 0.1372 | 0.7841 | 1510 | 1.1198 | 81218560 |
| 0.1255 | 0.7867 | 1515 | 1.1145 | 81483096 |
| 0.1282 | 0.7893 | 1520 | 1.1125 | 81752176 |
| 0.0595 | 0.7919 | 1525 | 1.1167 | 82024112 |
| 0.066 | 0.7945 | 1530 | 1.1178 | 82290712 |
| 0.0919 | 0.7971 | 1535 | 1.1149 | 82561600 |
| 0.0588 | 0.7997 | 1540 | 1.1152 | 82824768 |
| 0.1016 | 0.8023 | 1545 | 1.1155 | 83094936 |
| 0.13 | 0.8049 | 1550 | 1.1156 | 83365632 |
| 0.1853 | 0.8075 | 1555 | 1.1152 | 83640184 |
| 0.1485 | 0.8101 | 1560 | 1.1141 | 83906512 |
| 0.0931 | 0.8127 | 1565 | 1.1142 | 84169288 |
| 0.0991 | 0.8153 | 1570 | 1.1152 | 84438896 |
| 0.0657 | 0.8179 | 1575 | 1.1156 | 84708296 |
| 0.1306 | 0.8205 | 1580 | 1.1161 | 84978440 |
| 0.0877 | 0.8231 | 1585 | 1.1142 | 85243672 |
| 0.0486 | 0.8257 | 1590 | 1.1135 | 85516000 |
| 0.112 | 0.8282 | 1595 | 1.1129 | 85788976 |
| 0.0931 | 0.8308 | 1600 | 1.1123 | 86054672 |
| 0.115 | 0.8334 | 1605 | 1.1117 | 86326688 |
| 0.1467 | 0.8360 | 1610 | 1.1101 | 86587952 |
| 0.0735 | 0.8386 | 1615 | 1.1116 | 86856352 |
| 0.0861 | 0.8412 | 1620 | 1.1140 | 87123656 |
| 0.069 | 0.8438 | 1625 | 1.1146 | 87394456 |
| 0.124 | 0.8464 | 1630 | 1.1142 | 87668040 |
| 0.1213 | 0.8490 | 1635 | 1.1134 | 87930952 |
| 0.1185 | 0.8516 | 1640 | 1.1132 | 88204792 |
| 0.1363 | 0.8542 | 1645 | 1.1128 | 88475640 |
| 0.0784 | 0.8568 | 1650 | 1.1127 | 88752472 |
| 0.0725 | 0.8594 | 1655 | 1.1142 | 89023072 |
| 0.1274 | 0.8620 | 1660 | 1.1127 | 89292296 |
| 0.0915 | 0.8646 | 1665 | 1.1098 | 89556672 |
| 0.0879 | 0.8672 | 1670 | 1.1103 | 89828512 |
| 0.1169 | 0.8698 | 1675 | 1.1111 | 90098800 |
| 0.0754 | 0.8724 | 1680 | 1.1117 | 90368688 |
| 0.0655 | 0.8750 | 1685 | 1.1119 | 90638960 |
| 0.0857 | 0.8776 | 1690 | 1.1124 | 90905944 |
| 0.138 | 0.8802 | 1695 | 1.1124 | 91172456 |
| 0.1288 | 0.8828 | 1700 | 1.1102 | 91437872 |
| 0.1326 | 0.8854 | 1705 | 1.1093 | 91707752 |
| 0.079 | 0.8880 | 1710 | 1.1127 | 91967952 |
| 0.1163 | 0.8906 | 1715 | 1.1132 | 92243000 |
| 0.0695 | 0.8932 | 1720 | 1.1099 | 92504432 |
| 0.1912 | 0.8958 | 1725 | 1.1096 | 92772120 |
| 0.0578 | 0.8984 | 1730 | 1.1121 | 93036784 |
| 0.1171 | 0.9009 | 1735 | 1.1132 | 93308456 |
| 0.0976 | 0.9035 | 1740 | 1.1125 | 93571544 |
| 0.0958 | 0.9061 | 1745 | 1.1130 | 93845376 |
| 0.1002 | 0.9087 | 1750 | 1.1118 | 94112392 |
| 0.1054 | 0.9113 | 1755 | 1.1107 | 94381776 |
| 0.0643 | 0.9139 | 1760 | 1.1097 | 94647136 |
| 0.1492 | 0.9165 | 1765 | 1.1076 | 94921256 |
| 0.1253 | 0.9191 | 1770 | 1.1075 | 95191640 |
| 0.0655 | 0.9217 | 1775 | 1.1091 | 95464808 |
| 0.132 | 0.9243 | 1780 | 1.1091 | 95732304 |
| 0.0729 | 0.9269 | 1785 | 1.1107 | 96002320 |
| 0.0975 | 0.9295 | 1790 | 1.1114 | 96266072 |
| 0.0819 | 0.9321 | 1795 | 1.1090 | 96525304 |
| 0.103 | 0.9347 | 1800 | 1.1085 | 96796792 |
| 0.0969 | 0.9373 | 1805 | 1.1087 | 97069296 |
| 0.1124 | 0.9399 | 1810 | 1.1079 | 97336960 |
| 0.1047 | 0.9425 | 1815 | 1.1086 | 97611064 |
| 0.1063 | 0.9451 | 1820 | 1.1080 | 97877784 |
| 0.0861 | 0.9477 | 1825 | 1.1076 | 98150576 |
| 0.1211 | 0.9503 | 1830 | 1.1084 | 98428144 |
| 0.0827 | 0.9529 | 1835 | 1.1083 | 98700080 |
| 0.084 | 0.9555 | 1840 | 1.1093 | 98973312 |
| 0.0717 | 0.9581 | 1845 | 1.1097 | 99239504 |
| 0.1041 | 0.9607 | 1850 | 1.1103 | 99515464 |
| 0.1576 | 0.9633 | 1855 | 1.1083 | 99780552 |
| 0.126 | 0.9659 | 1860 | 1.1059 | 100057904 |
| 0.0821 | 0.9685 | 1865 | 1.1069 | 100328264 |
| 0.178 | 0.9711 | 1870 | 1.1091 | 100601592 |
| 0.092 | 0.9736 | 1875 | 1.1087 | 100864400 |
| 0.1448 | 0.9762 | 1880 | 1.1074 | 101130088 |
| 0.0811 | 0.9788 | 1885 | 1.1068 | 101397392 |
| 0.134 | 0.9814 | 1890 | 1.1078 | 101671984 |
| 0.0894 | 0.9840 | 1895 | 1.1098 | 101949576 |
| 0.1075 | 0.9866 | 1900 | 1.1096 | 102221592 |
| 0.1162 | 0.9892 | 1905 | 1.1075 | 102486912 |
| 0.1874 | 0.9918 | 1910 | 1.1071 | 102755504 |
| 0.0956 | 0.9944 | 1915 | 1.1066 | 103019656 |
| 0.0965 | 0.9970 | 1920 | 1.1067 | 103283104 |
| 0.0978 | 0.9996 | 1925 | 1.1047 | 103558064 |

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
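
### Checking the reported batch size and token throughput

The hyperparameters and the last row of the training results table can be cross-checked with a little arithmetic. The sketch below is not part of the original training code. It assumes a single training device, which the card does not state, and its variable names are illustrative. It confirms that `train_batch_size` times `gradient_accumulation_steps` reproduces the reported `total_train_batch_size`, and estimates the average number of input tokens per optimizer step and per example.

```python
# Values copied from the hyperparameter list and the final table row of this card.
train_batch_size = 8
gradient_accumulation_steps = 16
reported_total_train_batch_size = 128
final_step = 1925
final_tokens_seen = 103_558_064

# Assumption: one training device (the card does not say how many were used).
num_devices = 1

# Effective batch size per optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Average input tokens consumed per optimizer step and per example.
# The per-example figure may include padding tokens, depending on how
# "Input Tokens Seen" was counted during training.
tokens_per_step = final_tokens_seen / final_step
tokens_per_example = tokens_per_step / effective_batch_size

print(f"effective batch size: {effective_batch_size}")
print(f"tokens per step:      {tokens_per_step:.1f}")
print(f"tokens per example:   {tokens_per_example:.1f}")
```

Under the single-device assumption the effective batch size matches the reported 128, and the averages come out to roughly 53.8k input tokens per step and about 420 per example.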