jangedoo committed
Commit 85611e5
Parent: 01dfd60

Add new SentenceTransformer model.

Files changed (2):
1. README.md (+133 −81)
2. config_sentence_transformers.json (+2 −2)
README.md CHANGED
@@ -17,89 +17,89 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:1000
  - loss:MSELoss
- - dataset_size:5000
- - dataset_size:8000
- - dataset_size:100000
  widget:
- - source_sentence: 'The aggressive semi-employed religion workshop of Razzak, (EFP).

  '
  sentences:
- - 'मा ग्रिटर भेट्टाउन सकेन वा GDM प्रयोगकर्ताले कार्यान्वयन गर्न सकेन

  '
- - 'रज्जाकको आक्रामक अर्द्धशतक धर्मशाला, (एएफपी)।

  '
- - 'त्यसैले मेरो विजयपछि त्यस्तो अवस्था आउन दिनेछैन।

  '
- - source_sentence: 'The authority is being a constitutional body, it was also empowered
- by passing the bill from Parliament.

  '
  sentences:
- - 'अख्तियार संवैधानिक निकाय हुँदै हो, त्यसमा पनि संसदबाटै विधेयक पास गरेर अख्तियारलाई
- अधिकारसम्पन्न पनि गराइएको थियो।

  '
- - ' यहूदाका राजा सिदकियाहलाई उसका मानिसहरूलाई तिनीहरूका शत्रुहरूकहाँ सुम्पिनेछु
- जसले तिनीहरूलाई मार्न चाहन्छन्। ती सेनाहरू यरूशलेमबाट गइसकेका भएता पनि म तिनीहरूलाई
- बाबेलका राजाको सेनाहरूकहाँ सुम्पिनेछु।

  '
- - ' संकटकालको असर न्यायिक क्षेत्रमा मात्रै पर्दैन, समग्र मुलुकमै पर्छ।
-
- '
- - source_sentence: 'The two-day conference will participate in investors from China,
- India, Japan, the US, European countries, Britain and other countries, the Federation
- said.

  '
  sentences:
- - 'उनीहरूको जनजीविकाको आधार प्राकृतिक स्रोत रहेको छ।
-
- '
- - 'दुई दिनसम्म हुने सम्मेलनमा चीन, भारत, जापान, अमेरिका, युरोपियन देशहरू, बेलायत
- लगायत देशबाट लगानीकर्ताको सहभागिता गराउने महासंघले जानकारी दिएको छ

  '
- - 'नयाँ स्न्यापसट लिनका लागि यो बटन क्लिक गर्नुहोस्

  '
- - source_sentence: 'Mr Sankey issued a "confession" through his solicitor after Shields
- had been convicted but then withdrew it.

  '
  sentences:
- - 'श्री सान्कीले ढालहरू दोषी भएपछि आफ्नो समाधानकर्तामार्फत "स्वीकृति" जारी गर्नुभयो
- तर त्यसपछि यसलाई फिर्ता लिनुभयो।

  '
- - 'कृत्रिम रुपमा पेट्रोलियम पदार्थको मूल्य स्थिर राख्न अनुदान दिदै जाने हो भने नेपाली
- अर्थतन्त्र एकदिन धराशायी हुनेछ।

  '
- - 'ओली सरकारले "राष्ट्रियता-राष्ट्रवाद र" आर्थिक सम्ब्रिद्धि "-आर्थिक विकासलाई यसको
- प्राथमिकताको रूपमा घोषणा गरेको छ।

  '
- - source_sentence: 'We want to use this time to appeal to the American government
- to see if they can finally close this chapter.

  '
  sentences:
- - 'धेरैले घाउ पाए ओछ्यानमा थिए।

  '
- - 'नाम यसको अन्तरराष्ट्रिय हलको अद्वितिय डिजाइनबाट स्पष्ट रूपमा प्राप्त हुन्छ, जुन
- शीर्षकनियम स्पेसबाट बनेको छ, जुन ठूलो गहिराइमा उच्च दबाब बुझ्न सक्षम छ।

  '
- - 'हामी अमेरिकी सरकारलाई अपील गर्न यसपटक प्रयोग गर्न चाहन्छौं कि उनीहरूले अन्त्यमा
- यो अध्याय बन्द गर्न सक्छन्।

  '
  model-index:
@@ -113,7 +113,7 @@ model-index:
  type: unknown
  metrics:
  - type: negative_mse
- value: -0.32407890539616346
  name: Negative Mse
  - task:
  type: translation
@@ -123,13 +123,13 @@ model-index:
  type: unknown
  metrics:
  - type: src2trg_accuracy
- value: 0.05445
  name: Src2Trg Accuracy
  - type: trg2src_accuracy
- value: 0.02105
  name: Trg2Src Accuracy
  - type: mean_accuracy
- value: 0.03775
  name: Mean Accuracy
  ---

@@ -184,9 +184,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("jangedoo/all-MiniLM-L6-v2-nepali")
  # Run inference
  sentences = [
- 'We want to use this time to appeal to the American government to see if they can finally close this chapter.\n',
- 'हामी अमेरिकी सरकारलाई अपील गर्न यसपटक प्रयोग गर्न चाहन्छौं कि उनीहरूले अन्त्यमा यो अध्याय बन्द गर्न सक्छन्।\n',
- 'नाम यसको अन्तरराष्ट्रिय हलको अद्वितिय डिजाइनबाट स्पष्ट रूपमा प्राप्त हुन्छ, जुन शीर्षकनियम स्पेसबाट बनेको छ, जुन ठूलो गहिराइमा उच्च दबाब बुझ्न सक्षम छ।\n',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -232,7 +232,7 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:-----------------|:------------|
- | **negative_mse** | **-0.3241** |

  #### Translation
@@ -240,9 +240,9 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:------------------|:-----------|
- | src2trg_accuracy | 0.0544 |
- | trg2src_accuracy | 0.021 |
- | **mean_accuracy** | **0.0377** |

  <!--
  ## Bias, Risks and Limitations
@@ -263,7 +263,7 @@ You can finetune this model on your own dataset.
  #### momo22/eng2nep

  * Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
- * Size: 100,000 training samples
  * Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  | | English | Nepali | label |
@@ -274,8 +274,8 @@ You can finetune this model on your own dataset.
  | English | Nepali | label |
  |:---|:---|:---|
  | <code>But with the origin of feudal practices in the Middle Ages, the practice of untouchability began, as well as discrimination against women.<br></code> | <code>तर मध्ययुगमा सामन्ती प्रथाको उद्भव भएसँगै जसरी छुवाछुत प्रथाको शुरुवात भयो, त्यसैगरी नारी प्रति पनि विभेद गरिन थालियो<br></code> | <code>[-0.05432726442813873, 0.029996933415532112, -0.008532932959496975, -0.035200122743844986, 0.008856767788529396, ...]</code> |
- | <code>A Pandit was found on the way to Pokhara from Baglung.<br></code> | <code>वाग्लुङ्गबाट पोखरा आउँदा बाटोमा एकजना पण्डित भेटिए।<br></code> | <code>[-0.023763148114085197, 0.0959007516503334, -0.11197677254676819, 0.10978179425001144, -0.028137238696217537, ...]</code> |
- | <code>He went on: "She ate a perfectly normal and healthy diet.<br></code> | <code>उनी गए: "उनले पूर्ण सामान्य र स्वस्थ आहार खाइन्।<br></code> | <code>[0.028130479156970978, 0.030386686325073242, -0.012276170775294304, 0.1316223442554474, -0.01928202621638775, ...]</code> |
  * Loss: [<code>MSELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)

  ### Evaluation Dataset
@@ -291,11 +291,11 @@ You can finetune this model on your own dataset.
  | type | string | string | list |
  | details | <ul><li>min: 4 tokens</li><li>mean: 26.48 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 63.73 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>size: 384 elements</li></ul> |
  * Samples:
- | English | Nepali | label |
- |:---|:---|:---|
- | <code>Chapter 3<br></code> | <code>परिच्छेद–३<br></code> | <code>[-0.049459926784038544, 0.048675183206796646, 0.016583453863859177, 0.04876156523823738, -0.020754676312208176, ...]</code> |
- | <code>The capability of MOF would be strengthened to enable it to efficiently play the lead role in donor coordination, and to secure support from all stakeholders in aid coordination activities.<br></code> | <code>दाताहरूको समन्वयमा नेतृत्वदायीको भूमिका निर्वाह प्रभावकारी ढंगले गर्न अर्थ मन्त्रालयको क्षमता सुदृढ गरिनेछ यसको लागि सबै सरोकारवालाबाट समर्थन प्राप्त गरिनेछ ।<br></code> | <code>[-0.06200315058231354, -0.016507938504219055, -0.029924314469099045, -0.052509162575006485, 0.07746178656816483, ...]</code> |
- | <code>Polimatrix, Inc. is a system integrator and total solutions provider delivering radiation and nuclear protection and detection.<br></code> | <code>पोलिमाट्रिक्स, इन्कर्पोरेटिड प्रणाली इन्टिजर र कुल समाधान प्रदायक रेडियो र आणविक संरक्षण र पत्ता लगाउने प्रणाली इन्टिजर र कुल समाधान प्रदायक हो।<br></code> | <code>[-0.0446796678006649, 0.026428330689668655, -0.09837698936462402, -0.07765442878007889, -0.020364686846733093, ...]</code> |
  * Loss: [<code>MSELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)

  ### Training Hyperparameters
@@ -305,6 +305,7 @@ You can finetune this model on your own dataset.
  - `per_device_train_batch_size`: 64
  - `per_device_eval_batch_size`: 64
  - `learning_rate`: 2e-05
  - `warmup_ratio`: 0.1
  - `bf16`: True
  - `push_to_hub`: True
@@ -324,13 +325,14 @@ You can finetune this model on your own dataset.
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `learning_rate`: 2e-05
  - `weight_decay`: 0.0
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
- - `num_train_epochs`: 3
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
@@ -421,35 +423,85 @@ You can finetune this model on your own dataset.
  - `optim_target_modules`: None
  - `batch_eval_metrics`: False
  - `eval_on_start`: False
  - `batch_sampler`: batch_sampler
  - `multi_dataset_batch_sampler`: proportional

  </details>

  ### Training Logs
- | Epoch | Step | Training Loss | loss | mean_accuracy | negative_mse |
- |:------:|:----:|:-------------:|:------:|:-------------:|:------------:|
- | 0.4 | 50 | 0.0021 | 0.0019 | 0.0111 | -0.3837 |
- | 0.8 | 100 | 0.002 | 0.0019 | 0.0123 | -0.3794 |
- | 0.4 | 50 | 0.002 | 0.0019 | 0.0130 | -0.3773 |
- | 0.8 | 100 | 0.002 | 0.0019 | 0.0135 | -0.3744 |
- | 0.3199 | 500 | 0.002 | 0.0018 | 0.0166 | -0.3597 |
- | 0.6398 | 1000 | 0.0019 | 0.0018 | 0.0204 | -0.3461 |
- | 0.9597 | 1500 | 0.0018 | 0.0017 | 0.0241 | -0.3389 |
- | 1.2796 | 2000 | 0.0018 | 0.0017 | 0.0273 | -0.3351 |
- | 1.5995 | 2500 | 0.0018 | 0.0017 | 0.0312 | -0.3302 |
- | 1.9194 | 3000 | 0.0018 | 0.0017 | 0.0328 | -0.3284 |
- | 2.2393 | 3500 | 0.0018 | 0.0017 | 0.0353 | -0.3264 |
- | 2.5592 | 4000 | 0.0018 | 0.0016 | 0.0374 | -0.3246 |
- | 2.8791 | 4500 | 0.0018 | 0.0016 | 0.0377 | -0.3241 |

  ### Framework Versions
- - Python: 3.10.12
  - Sentence Transformers: 3.0.1
- - Transformers: 4.42.4
- - PyTorch: 2.3.1+cu121
- - Accelerate: 0.32.1
  - Datasets: 2.21.0
  - Tokenizers: 0.19.1

@@ -17,89 +17,89 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:800000
  - loss:MSELoss
  widget:
+ - source_sentence: 'OUTDOOR SPACE: A covered porch and a deck.

  '
  sentences:
+ - 'नेपालमा चीन भारतबाट अवैध रूपले प्रशस्तै प्लाष्टिकका सामानहरू आइरहे पनि के कति
+ आउँछ भन्ने तथ्याङ्क कसैसँग छैन।

  '
+ - 'पछिल्लो समयमा बेलायतले ब्रिटिस – गोर्खा सेनामा कार्यरत भूपू सैनिकहरूलाई नागरिकता
+ दिने जनाएको छ।

  '
+ - 'OUTDOOR SPACE: ढाकिएको दलान डक।

  '
+ - source_sentence: 'Gunakar Aryal, station manager of Madanpokhara FM, says preparations
+ are underway to construct a radio station building from this amount.

  '
  sentences:
+ - 'उक्त अवसरमा समितीका उपाध्यक्ष सहीद हवारी, कोषाध्यक्ष ईलियास अन्सारी, सचिव ताजमा
+ खातुन, विश्वास सामुदायिक संस्थाका अध्यक्ष सुलेमान हवारी, युवा नेता रामकिसोर सिंह
+ पराग लगायतको उपस्थिती रहेको थियो

  '
+ - 'मदनपोखरा एफएमका स्टेशन मेनेजर गुणाकर अर्याल यो रकमबाट रेडियोको स्टेशन भवन निर्माण
+ गर्ने तयारी भइरहेको बताउँछन्।

  '
+ - 'एकपटक तिनीहरू माथि छन्, भण्डारले गृह पहुँचकर्ताहरूमा ठूलो छुट राखे।

  '
+ - source_sentence: "I will stay here, because a good opportunity for a great and growing\
+ \ work has been given to me now. And there are many people working against it.\
+ \ \n"
  sentences:
+ - 'राज्य विभागका प्रवक्ताले आफ्ना सबै कैदीहरूको भल्भकालागि FARC जिम्मेवार राखे र
+ भन्नुभयो "जीवनको प्रमाण होस्टहरूको निष्कासन सुरक्षित गर्न कुनै श्रेणी प्रयासका
+ लागि आवश्यक र आवश्यक कदम हो।"

  '
+ - "किनभने त्यहाँ प्रभावपूर्ण कार्यको एउटा विशाल मौका हात लाग्नेवाला छ। अनि धेरैजना\
+ \ त्यस कार्यको विरोधमा पनि काम गर्दैछन्। \n"
+ - '(८) यस नियम बमोजिम इजाजतपत्रवालाहरु गाभिएको सूचना उपनियम (४) बमोजिम इजाजतपत्र
+ प्राप्त गर्ने संस्थाले राष्ट्रियस्तरको दैनिक पत्रिकामा प्रकाशन गर्नु पर्नेछ ।
+ ९. इजाजतपत्र रद्द भएको जानकारी दिनु पर्नेः ऐनको दफा १३ बमोजिम इजाजतपत्र रद्द भएमा
+ विभागले सोको जानकारी इजाजतपत्रवालालाई दिनु पर्नेछ ।

  '
+ - source_sentence: 'Due to the fake, the audio CDs and VCDs of foreign songs at a
+ very cheap rate in the open roads and markets of Marashyam Memorial Care have
+ started to affect the Nepalese music market.

  '
  sentences:
+ - 'दुवैजना गोदावरी घुमेर आएका थिए।

  '
+ - 'नीलो सूर्य बायोडिजेल शुद्ध तरकारी तेलबाट बनेको प्रिमियम जैविक इन्धनको प्रमुख
+ आपूर्तिकर्ता हो।

  '
+ - 'नक्कलीले गर्दा सक्कलीलाई मारश्याम स्मृतराजधानीका खुला सडक बजारमा अत्यन्त सस्तो
+ दरका विदेशी गीतका अडियो सीडी तथा भीसीडी पाइन थालेपछि त्यसको ठाडो असर नेपाली संगीत
+ बजारमा पर्र्न थालेको छ।

  '
+ - source_sentence: '"This was very surprising to me," said UM Professor Michael Combi.

  '
  sentences:
+ - '९) अनाजलाई भिजाएको भाँडोबाट निकालेर एक पटक सफा पानीले धोई चालनीजस्तो जालीदार
+ भाँडोमा खन्याउनु पर्दछ र यसमा एक घण्टा जति राखी पानी पूरा तर्केपछि मोटो कपडामा
+ बाँध्ने या मोटो कपडाको थैलोमा भरेर झुण्ड्याई दिने या कुनै भाँडामा राखी दिने।

  '
+ - '"यो मेरोलागि निकै आश्चर्यजनक थियो," युएम प्राध्यापक माइकल कम्बिले भन्नुभयो।

  '
+ - 'ऐंसेलुखर्क ३, नयाँ टोल, खोटाङ

  '
  model-index:
 
@@ -113,7 +113,7 @@ model-index:
  type: unknown
  metrics:
  - type: negative_mse
+ value: -0.21079338621348143
  name: Negative Mse
  - task:
  type: translation
 
@@ -123,13 +123,13 @@ model-index:
  type: unknown
  metrics:
  - type: src2trg_accuracy
+ value: 0.7323
  name: Src2Trg Accuracy
  - type: trg2src_accuracy
+ value: 0.5639
  name: Trg2Src Accuracy
  - type: mean_accuracy
+ value: 0.6480999999999999
  name: Mean Accuracy
  ---
 
@@ -184,9 +184,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("jangedoo/all-MiniLM-L6-v2-nepali")
  # Run inference
  sentences = [
+ '"This was very surprising to me," said UM Professor Michael Combi.\n',
+ '"यो मेरोलागि निकै आश्चर्यजनक थियो," युएम प्राध्यापक माइकल कम्बिले भन्नुभयो।\n',
+ 'ऐंसेलुखर्क ३, नयाँ टोल, खोटाङ\n',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
 
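The encoded vectors are compared with cosine similarity, which is what `model.similarity` computes in Sentence Transformers 3.x. A minimal NumPy sketch of that comparison, with toy 3-dimensional vectors standing in for the model's real 384-dimensional embeddings (the values are illustrative, not actual model output):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of an (n, d) matrix."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

# Toy stand-ins: an English sentence, its Nepali translation, an unrelated sentence.
emb = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
])
sim = cosine_similarity_matrix(emb)
print(sim.shape)              # (3, 3)
assert sim[0, 1] > sim[0, 2]  # the translation pair scores higher than the unrelated pair
```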
@@ -232,7 +232,7 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:-----------------|:------------|
+ | **negative_mse** | **-0.2108** |

  #### Translation
 
@@ -240,9 +240,9 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:------------------|:-----------|
+ | src2trg_accuracy | 0.7323 |
+ | trg2src_accuracy | 0.5639 |
+ | **mean_accuracy** | **0.6481** |
 
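These translation metrics follow the usual bitext-retrieval recipe (as in Sentence Transformers' `TranslationEvaluator`): encode the aligned English and Nepali sentences, and for each source embedding count it correct when its nearest neighbour on the target side is the true translation; `mean_accuracy` averages the two directions. A toy sketch with illustrative 2-dimensional vectors in place of real embeddings:

```python
import numpy as np

def translation_accuracy(src: np.ndarray, trg: np.ndarray) -> float:
    """src2trg accuracy: fraction of rows i whose nearest target row (by cosine) is row i."""
    src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
    trg_n = trg / np.linalg.norm(trg, axis=1, keepdims=True)
    sim = src_n @ trg_n.T  # (n, n) cosine similarities
    return float(np.mean(np.argmax(sim, axis=1) == np.arange(len(src))))

# Toy aligned pairs: row i of trg is the translation of row i of src.
src = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
trg = np.array([[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]])

src2trg = translation_accuracy(src, trg)
trg2src = translation_accuracy(trg, src)
mean_acc = (src2trg + trg2src) / 2
print(src2trg, trg2src, mean_acc)  # all 1.0 for these well-separated toy pairs
```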
  <!--
  ## Bias, Risks and Limitations
 
@@ -263,7 +263,7 @@ You can finetune this model on your own dataset.
  #### momo22/eng2nep

  * Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
+ * Size: 800,000 training samples
  * Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  | | English | Nepali | label |
 
@@ -274,8 +274,8 @@ You can finetune this model on your own dataset.
  | English | Nepali | label |
  |:---|:---|:---|
  | <code>But with the origin of feudal practices in the Middle Ages, the practice of untouchability began, as well as discrimination against women.<br></code> | <code>तर मध्ययुगमा सामन्ती प्रथाको उद्भव भएसँगै जसरी छुवाछुत प्रथाको शुरुवात भयो, त्यसैगरी नारी प्रति पनि विभेद गरिन थालियो<br></code> | <code>[-0.05432726442813873, 0.029996933415532112, -0.008532932959496975, -0.035200122743844986, 0.008856767788529396, ...]</code> |
+ | <code>A Pandit was found on the way to Pokhara from Baglung.<br></code> | <code>वाग्लुङ्गबाट पोखरा आउँदा बाटोमा एकजना पण्डित भेटिए।<br></code> | <code>[-0.023763157427310944, 0.09590080380439758, -0.11197677254676819, 0.10978180170059204, -0.028137221932411194, ...]</code> |
+ | <code>He went on: "She ate a perfectly normal and healthy diet.<br></code> | <code>उनी गए: "उनले पूर्ण सामान्य र स्वस्थ आहार खाइन्।<br></code> | <code>[0.028130438178777695, 0.03038676083087921, -0.012276142835617065, 0.1316222846508026, -0.01928197592496872, ...]</code> |
  * Loss: [<code>MSELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)

  ### Evaluation Dataset
 
@@ -291,11 +291,11 @@ You can finetune this model on your own dataset.
  | type | string | string | list |
  | details | <ul><li>min: 4 tokens</li><li>mean: 26.48 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 63.73 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>size: 384 elements</li></ul> |
  * Samples:
+ | English | Nepali | label |
+ |:---|:---|:---|
+ | <code>Chapter 3<br></code> | <code>परिच्छेद–३<br></code> | <code>[-0.04945989325642586, 0.048675231635570526, 0.016583407297730446, 0.048761602491140366, -0.020754696801304817, ...]</code> |
+ | <code>The capability of MOF would be strengthened to enable it to efficiently play the lead role in donor coordination, and to secure support from all stakeholders in aid coordination activities.<br></code> | <code>दाताहरूको समन्वयमा नेतृत्वदायीको भूमिका निर्वाह प्रभावकारी ढंगले गर्न अर्थ मन्त्रालयको क्षमता सुदृढ गरिनेछ यसको लागि सबै सरोकारवालाबाट समर्थन प्राप्त गरिनेछ ।<br></code> | <code>[-0.06200314313173294, -0.016507906839251518, -0.029924260452389717, -0.05250919610261917, 0.07746176421642303, ...]</code> |
+ | <code>Polimatrix, Inc. is a system integrator and total solutions provider delivering radiation and nuclear protection and detection.<br></code> | <code>पोलिमाट्रिक्स, इन्कर्पोरेटिड प्रणाली इन्टिजर र कुल समाधान प्रदायक रेडियो र आणविक संरक्षण र पत्ता लगाउने प्रणाली इन्टिजर र कुल समाधान प्रदायक हो।<br></code> | <code>[-0.0446796789765358, 0.02642829343676567, -0.09837698936462402, -0.07765442132949829, -0.02036469243466854, ...]</code> |
  * Loss: [<code>MSELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
 
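The training objective above is MSELoss knowledge distillation: the 384-element `label` column holds teacher embeddings, and the student model is trained to reproduce them. The reported `negative_mse` is the negated mean squared error (higher is better); in Sentence Transformers' `MSEEvaluator` the MSE is additionally scaled by 100 before negation, which is why −0.2108 corresponds to a raw MSE of roughly 0.002. A toy sketch with illustrative vectors, not real model outputs:

```python
import numpy as np

def mse_loss(student: np.ndarray, teacher: np.ndarray) -> float:
    """Mean squared error between student embeddings and teacher 'label' vectors."""
    return float(np.mean((student - teacher) ** 2))

# Toy stand-ins for the 384-dim teacher labels and the student's outputs.
teacher = np.array([[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]])
student = np.array([[0.1, -0.1, 0.3], [0.1, 0.5, -0.1]])

loss = mse_loss(student, teacher)  # training objective: lower is better
negative_mse = -100 * loss         # evaluator-style report: higher is better
print(round(loss, 6), round(negative_mse, 4))
```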
  ### Training Hyperparameters
 
@@ -305,6 +305,7 @@ You can finetune this model on your own dataset.
  - `per_device_train_batch_size`: 64
  - `per_device_eval_batch_size`: 64
  - `learning_rate`: 2e-05
+ - `num_train_epochs`: 5
  - `warmup_ratio`: 0.1
  - `bf16`: True
  - `push_to_hub`: True
 
@@ -324,13 +325,14 @@ You can finetune this model on your own dataset.
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
  - `learning_rate`: 2e-05
  - `weight_decay`: 0.0
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 5
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
 
@@ -421,35 +423,85 @@ You can finetune this model on your own dataset.
  - `optim_target_modules`: None
  - `batch_eval_metrics`: False
  - `eval_on_start`: False
+ - `eval_use_gather_object`: False
  - `batch_sampler`: batch_sampler
  - `multi_dataset_batch_sampler`: proportional

  </details>

  ### Training Logs
+ | Epoch | Step | Training Loss | loss | mean_accuracy | negative_mse |
+ |:------:|:-----:|:-------------:|:------:|:-------------:|:------------:|
+ | 0.08 | 1000 | 0.0022 | 0.0019 | 0.0132 | -0.3831 |
+ | 0.16 | 2000 | 0.002 | 0.0018 | 0.0184 | -0.3665 |
+ | 0.24 | 3000 | 0.0019 | 0.0018 | 0.0243 | -0.3511 |
+ | 0.32 | 4000 | 0.0019 | 0.0017 | 0.0307 | -0.3400 |
+ | 0.4 | 5000 | 0.0018 | 0.0017 | 0.0386 | -0.3317 |
+ | 0.48 | 6000 | 0.0018 | 0.0016 | 0.0504 | -0.3239 |
+ | 0.56 | 7000 | 0.0017 | 0.0016 | 0.0701 | -0.3148 |
+ | 0.64 | 8000 | 0.0017 | 0.0016 | 0.0973 | -0.3057 |
+ | 0.72 | 9000 | 0.0017 | 0.0015 | 0.1307 | -0.2964 |
+ | 0.8 | 10000 | 0.0016 | 0.0015 | 0.1672 | -0.2882 |
+ | 0.88 | 11000 | 0.0016 | 0.0014 | 0.2049 | -0.2802 |
+ | 0.96 | 12000 | 0.0016 | 0.0014 | 0.2358 | -0.2752 |
+ | 1.04 | 13000 | 0.0015 | 0.0014 | 0.2631 | -0.2701 |
+ | 1.12 | 14000 | 0.0015 | 0.0014 | 0.2896 | -0.2650 |
+ | 1.2 | 15000 | 0.0015 | 0.0013 | 0.3191 | -0.2606 |
+ | 1.28 | 16000 | 0.0015 | 0.0013 | 0.3467 | -0.2570 |
+ | 1.3600 | 17000 | 0.0014 | 0.0013 | 0.3674 | -0.2536 |
+ | 1.44 | 18000 | 0.0014 | 0.0013 | 0.3868 | -0.2502 |
+ | 1.52 | 19000 | 0.0014 | 0.0013 | 0.4069 | -0.2475 |
+ | 1.6 | 20000 | 0.0014 | 0.0013 | 0.4235 | -0.2456 |
+ | 1.6800 | 21000 | 0.0014 | 0.0013 | 0.4397 | -0.2433 |
+ | 1.76 | 22000 | 0.0014 | 0.0012 | 0.4538 | -0.2410 |
+ | 1.8400 | 23000 | 0.0014 | 0.0012 | 0.4630 | -0.2392 |
+ | 1.92 | 24000 | 0.0014 | 0.0012 | 0.4798 | -0.2374 |
+ | 2.0 | 25000 | 0.0014 | 0.0012 | 0.4880 | -0.2354 |
+ | 2.08 | 26000 | 0.0013 | 0.0012 | 0.5018 | -0.2340 |
+ | 2.16 | 27000 | 0.0013 | 0.0012 | 0.5097 | -0.2324 |
+ | 2.24 | 28000 | 0.0013 | 0.0012 | 0.5199 | -0.2305 |
+ | 2.32 | 29000 | 0.0013 | 0.0012 | 0.5291 | -0.2292 |
+ | 2.4 | 30000 | 0.0013 | 0.0012 | 0.5373 | -0.2292 |
+ | 2.48 | 31000 | 0.0013 | 0.0012 | 0.5487 | -0.2271 |
+ | 2.56 | 32000 | 0.0013 | 0.0012 | 0.5543 | -0.2259 |
+ | 2.64 | 33000 | 0.0013 | 0.0012 | 0.5616 | -0.2249 |
+ | 2.7200 | 34000 | 0.0013 | 0.0012 | 0.5698 | -0.2236 |
+ | 2.8 | 35000 | 0.0013 | 0.0012 | 0.5779 | -0.2225 |
+ | 2.88 | 36000 | 0.0013 | 0.0012 | 0.5829 | -0.2218 |
+ | 2.96 | 37000 | 0.0013 | 0.0011 | 0.5893 | -0.2208 |
+ | 3.04 | 38000 | 0.0013 | 0.0011 | 0.5947 | -0.2202 |
+ | 3.12 | 39000 | 0.0013 | 0.0011 | 0.5986 | -0.2195 |
+ | 3.2 | 40000 | 0.0013 | 0.0011 | 0.6019 | -0.2183 |
+ | 3.2800 | 41000 | 0.0013 | 0.0011 | 0.6076 | -0.2177 |
+ | 3.36 | 42000 | 0.0013 | 0.0011 | 0.6112 | -0.2173 |
+ | 3.44 | 43000 | 0.0013 | 0.0011 | 0.6143 | -0.2166 |
+ | 3.52 | 44000 | 0.0012 | 0.0011 | 0.6178 | -0.2163 |
+ | 3.6 | 45000 | 0.0012 | 0.0011 | 0.6225 | -0.2153 |
+ | 3.68 | 46000 | 0.0012 | 0.0011 | 0.6232 | -0.2148 |
+ | 3.76 | 47000 | 0.0012 | 0.0011 | 0.6292 | -0.2142 |
+ | 3.84 | 48000 | 0.0012 | 0.0011 | 0.6317 | -0.2136 |
+ | 3.92 | 49000 | 0.0012 | 0.0011 | 0.6323 | -0.2135 |
+ | 4.0 | 50000 | 0.0012 | 0.0011 | 0.634 | -0.2134 |
+ | 4.08 | 51000 | 0.0012 | 0.0011 | 0.6362 | -0.2129 |
+ | 4.16 | 52000 | 0.0012 | 0.0011 | 0.6377 | -0.2126 |
+ | 4.24 | 53000 | 0.0012 | 0.0011 | 0.6379 | -0.2122 |
+ | 4.32 | 54000 | 0.0012 | 0.0011 | 0.6413 | -0.2118 |
+ | 4.4 | 55000 | 0.0012 | 0.0011 | 0.6425 | -0.2117 |
+ | 4.48 | 56000 | 0.0012 | 0.0011 | 0.6425 | -0.2115 |
+ | 4.5600 | 57000 | 0.0012 | 0.0011 | 0.6454 | -0.2114 |
+ | 4.64 | 58000 | 0.0012 | 0.0011 | 0.6440 | -0.2112 |
+ | 4.72 | 59000 | 0.0012 | 0.0011 | 0.6463 | -0.2110 |
+ | 4.8 | 60000 | 0.0012 | 0.0011 | 0.6466 | -0.2110 |
+ | 4.88 | 61000 | 0.0012 | 0.0011 | 0.6465 | -0.2109 |
+ | 4.96 | 62000 | 0.0012 | 0.0011 | 0.6481 | -0.2108 |

  ### Framework Versions
+ - Python: 3.11.9
  - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.0
+ - PyTorch: 2.4.0+cu121
+ - Accelerate: 0.33.0
  - Datasets: 2.21.0
  - Tokenizers: 0.19.1
 
config_sentence_transformers.json CHANGED
@@ -1,8 +1,8 @@
  {
  "__version__": {
  "sentence_transformers": "3.0.1",
- "transformers": "4.42.4",
- "pytorch": "2.3.1+cu121"
+ "transformers": "4.44.0",
+ "pytorch": "2.4.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,