SITGES-aina4-FT1 / README.md
metadata
base_model: projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base
datasets: []
language:
  - ca
library_name: sentence-transformers
license: apache-2.0
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:4173
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Queixa: Deixar constància de la vostra disconformitat per un mal servei
      (un tracte inapropiat, un temps d'espera excessiu, etc.), sense demanar
      cap indemnització.
    sentences:
      - >-
        Quin és el format de sortida del tràmit de baixa de la llicència de
        gual?
      - Quin és el tipus de venda que es realitza en els mercats setmanals?
      - Quin és el paper de la queixa en la resolució de conflictes?
  - source_sentence: >-
      L'empleat que en l'exercici de les seves tasques tingui assignada la
      funció de conducció de vehicles municipals, pot sol·licitar un ajut per
      les despeses ocasionades per a la renovació del carnet de conduir
      (certificat mèdic i administratiu).
    sentences:
      - Quin és el resultat esperat de les escoles que reben les subvencions?
      - Quin és el requisit per obtenir una autorització d'estacionament?
      - Quin és el requisit per a sol·licitar l'ajut social?
  - source_sentence: >-
      Aportació de documentació. Subvencions per finançar despeses d'hipoteca,
      subministrament i altres serveis i la manca d'ingressos de lloguer de les
      entitats culturals
    sentences:
      - Quin és el propòsit de la documentació?
      - Quin és el paper del públic assistent en el Ple Municipal?
      - >-
        Quin és el paper de l'ajuntament en la renovació del carnet de persona
        cuidadora?
  - source_sentence: >-
      la Fira de la Vila del Llibre de Sitges consistent en un conjunt de
      parades instal·lades al Passeig Marítim
    sentences:
      - >-
        Quin és el paper de la llicència de parcel·lació en la construcció
        d'edificacions?
      - >-
        Quin és l'objectiu del tràmit de participació en processos de selecció
        de personal de l'Ajuntament?
      - >-
        Quin és el lloc on es desenvolupa la Fira de la Vila del Llibre de
        Sitges?
  - source_sentence: >-
      Mitjançant aquest tràmit la persona interessada posa en coneixement de
      l'Ajuntament de Sitges l'inici d'un espectacle públic o activitat
      recreativa de caràcter extraordinari...
    sentences:
      - >-
        Quin és el paper de la persona interessada en la llicència per a
        espectacles públics o activitats recreatives de caràcter extraordinari?
      - >-
        Quin és el paper del Registre de Sol·licitants d'Habitatge amb Protecció
        Oficial en la gestió d'habitatges?
      - >-
        Quin és el tipus de familiars que es tenen en compte per l'ajut
        especial?
model-index:
  - name: BGE SITGES CAT
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.07327586206896551
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.15732758620689655
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21767241379310345
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.39439655172413796
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07327586206896551
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.05244252873563218
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.043534482758620686
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03943965517241379
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07327586206896551
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.15732758620689655
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21767241379310345
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.39439655172413796
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.20125893142070614
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.14385604816639316
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.17098930660026063
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.07327586206896551
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.15086206896551724
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21767241379310345
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.39439655172413796
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07327586206896551
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.050287356321839075
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04353448275862069
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03943965517241379
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07327586206896551
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.15086206896551724
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21767241379310345
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.39439655172413796
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2016207682773376
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.14438799945265474
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1715919733142084
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.07327586206896551
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.14870689655172414
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21120689655172414
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.40086206896551724
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07327586206896551
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.04956896551724138
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04224137931034483
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.04008620689655173
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07327586206896551
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.14870689655172414
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21120689655172414
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.40086206896551724
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2021149795452301
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.1433856732348113
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.16973847535400444
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.06896551724137931
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.14655172413793102
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21767241379310345
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.38146551724137934
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.06896551724137931
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.048850574712643674
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04353448275862069
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03814655172413793
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.06896551724137931
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.14655172413793102
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21767241379310345
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.38146551724137934
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.19535554125135882
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.1398416119321293
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.16597320243564267
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.05603448275862069
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.13793103448275862
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.1939655172413793
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.36853448275862066
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.05603448275862069
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.04597701149425287
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.03879310344827586
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03685344827586207
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.05603448275862069
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.13793103448275862
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.1939655172413793
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.36853448275862066
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.18225870966588442
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.12688492063492074
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.15425908300208627
            name: Cosine Map@100

BGE SITGES CAT

This is a sentence-transformers model fine-tuned from projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.

Model Details

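Model Description

  • Model Type: Sentence Transformer
  • Base model: projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: ca
  • License: apache-2.0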

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
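The Pooling module above has pooling_mode_mean_tokens set to True, so a sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask; the function name `mean_pool` and the toy values are illustrative and not part of the library, which operates on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask is 1; padding (mask 0) is ignored."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(sum(attention_mask), 1)
    return [s / count for s in summed]


# Toy input: three tokens of width 2 (the real model uses width 768); the last is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

Because the padded token is masked out, its large values do not influence the result.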

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/SITGES-aina4_moreseq")
# Run inference
sentences = [
    "Mitjançant aquest tràmit la persona interessada posa en coneixement de l'Ajuntament de Sitges l'inici d'un espectacle públic o activitat recreativa de caràcter extraordinari...",
    'Quin és el paper de la persona interessada en la llicència per a espectacles públics o activitats recreatives de caràcter extraordinari?',
    "Quin és el paper del Registre de Sol·licitants d'Habitatge amb Protecció Oficial en la gestió d'habitatges?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
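The model was trained with MatryoshkaLoss and evaluated at 768, 512, 256, 128 and 64 dimensions, which suggests that embeddings can be shortened by keeping only their leading components. The sketch below illustrates that step in pure Python under the assumption that truncated vectors are re-normalized before comparison; `truncate_embedding` and `cosine_similarity` are hypothetical helper names, and the 4-dimensional vectors stand in for real model output. Recent Sentence Transformers releases also expose a `truncate_dim` constructor argument for this purpose; check the version you have installed.

```python
import math
from typing import List, Sequence


def truncate_embedding(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding width")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional stand-ins for real 768-dimensional embeddings.
passage = [0.6, 0.8, 0.1, -0.2]
question = [0.5, 0.9, -0.3, 0.4]

full_score = cosine_similarity(passage, question)
short_score = cosine_similarity(truncate_embedding(passage, 2), truncate_embedding(question, 2))
print(f"full: {full_score:.4f}  truncated to 2 dims: {short_score:.4f}")
```

Shorter vectors reduce storage and search cost; the evaluation tables in this card show how much retrieval quality changes at each size.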

Evaluation

Metrics

Information Retrieval

Each column is one evaluation set from the model-index metadata (dim_768 to dim_64), i.e. one embedding dimension.

| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.0733 | 0.0733 | 0.0733 | 0.069 | 0.056 |
| cosine_accuracy@3 | 0.1573 | 0.1509 | 0.1487 | 0.1466 | 0.1379 |
| cosine_accuracy@5 | 0.2177 | 0.2177 | 0.2112 | 0.2177 | 0.194 |
| cosine_accuracy@10 | 0.3944 | 0.3944 | 0.4009 | 0.3815 | 0.3685 |
| cosine_precision@1 | 0.0733 | 0.0733 | 0.0733 | 0.069 | 0.056 |
| cosine_precision@3 | 0.0524 | 0.0503 | 0.0496 | 0.0489 | 0.046 |
| cosine_precision@5 | 0.0435 | 0.0435 | 0.0422 | 0.0435 | 0.0388 |
| cosine_precision@10 | 0.0394 | 0.0394 | 0.0401 | 0.0381 | 0.0369 |
| cosine_recall@1 | 0.0733 | 0.0733 | 0.0733 | 0.069 | 0.056 |
| cosine_recall@3 | 0.1573 | 0.1509 | 0.1487 | 0.1466 | 0.1379 |
| cosine_recall@5 | 0.2177 | 0.2177 | 0.2112 | 0.2177 | 0.194 |
| cosine_recall@10 | 0.3944 | 0.3944 | 0.4009 | 0.3815 | 0.3685 |
| cosine_ndcg@10 | 0.2013 | 0.2016 | 0.2021 | 0.1954 | 0.1823 |
| cosine_mrr@10 | 0.1439 | 0.1444 | 0.1434 | 0.1398 | 0.1269 |
| cosine_map@100 | 0.171 | 0.1716 | 0.1697 | 0.166 | 0.1543 |
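The metrics above can be computed from ranked retrieval results. The sketch below shows one way to do so, assuming each query has exactly one relevant document. That assumption is consistent with the reported numbers, where accuracy@k equals recall@k and precision@k equals accuracy@k divided by k, but the card does not state it. The function name and toy data are hypothetical, and this is not the evaluator code used to produce the table.

```python
import math
from typing import Dict, List


def retrieval_metrics(rankings: List[List[str]], relevant: List[str], k: int) -> Dict[str, float]:
    """Average accuracy/precision/recall/MRR/NDCG at cutoff k.

    rankings[i] is the ranked list of document ids returned for query i and
    relevant[i] is the single document id that answers it (an assumption).
    """
    n = len(rankings)
    hits = 0
    rr_sum = 0.0
    ndcg_sum = 0.0
    for ranked, target in zip(rankings, relevant):
        top_k = ranked[:k]
        if target in top_k:
            rank = top_k.index(target) + 1  # 1-based position of the hit
            hits += 1
            rr_sum += 1.0 / rank
            # With one relevant document the ideal DCG is 1, so NDCG = DCG.
            ndcg_sum += 1.0 / math.log2(rank + 1)
    accuracy = hits / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the only relevant document is found or not
        "mrr": rr_sum / n,
        "ndcg": ndcg_sum / n,
    }


# Toy data: four queries, each with one correct document id.
rankings = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d9", "d8", "d7"], ["d2", "d1", "d3"]]
relevant = ["d1", "d4", "d7", "d0"]
print(retrieval_metrics(rankings, relevant, k=3))
```

In the toy data three of four queries find their document within the top 3, so accuracy@3 and recall@3 are both 0.75 while precision@3 is 0.25.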

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 6
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
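The settings above combine a per-device batch size of 16 with 16 gradient accumulation steps, 6 epochs, a cosine scheduler and a warmup ratio of 0.1. The sketch below works out what that implies for the number of optimizer steps and the learning rate curve. It assumes a single training device and the 4173 training samples given in the card's dataset_size tag, and the step arithmetic reflects a common way trainers count updates rather than code taken from the library. Under those assumptions it reproduces the 96 total steps and the epoch fractions seen in the training log.

```python
import math

# Values taken from the card; a single training device is assumed.
DATASET_SIZE = 4173
PER_DEVICE_BATCH = 16
GRAD_ACCUM = 16
EPOCHS = 6
PEAK_LR = 2e-5
WARMUP_RATIO = 0.1

batches_per_epoch = math.ceil(DATASET_SIZE / PER_DEVICE_BATCH)  # 261 mini-batches
updates_per_epoch = batches_per_epoch // GRAD_ACCUM             # 16 optimizer steps
total_steps = updates_per_epoch * EPOCHS                        # 96
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)            # 10
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM                 # 256 samples per update


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def epoch_at(step: int) -> float:
    """Fraction of epochs completed after `step` optimizer updates."""
    return step * GRAD_ACCUM / batches_per_epoch


print(total_steps, warmup_steps, effective_batch)
for step in (5, 10, 53, 96):
    print(step, round(epoch_at(step), 4), f"{learning_rate(step):.2e}")
```

The learning rate peaks at step 10 and decays to zero at step 96, so most of the run happens on the decaying part of the curve.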

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
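The card's tags list MultipleNegativesRankingLoss together with MatryoshkaLoss, and the batch_sampler: no_duplicates setting matters for the former, because every other positive in a batch is treated as a negative and a duplicate would act as a false negative. The following pure-Python sketch illustrates the idea, assuming cosine similarity scaled by 20 and equal weights for every truncation size; the function names are hypothetical, the toy vectors have width 4 instead of 768, and this is not the library implementation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def _cos(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def ranking_loss(anchors: List[Vector], positives: List[Vector], scale: float = 20.0) -> float:
    """In-batch softmax cross-entropy: positives[i] is the target for anchors[i],
    and every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * _cos(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors: List[Vector], positives: List[Vector], dims: Sequence[int]) -> float:
    """Sum the ranking loss over embeddings truncated to each prefix length."""
    return sum(
        ranking_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d in dims
    )


# Toy batch of two (anchor, positive) pairs.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
print(round(matryoshka_loss(anchors, positives, dims=(4, 2)), 6))
```

Summing over several prefix lengths pushes the leading components to be useful on their own, which is what makes truncated embeddings usable.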

Training Logs

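| Epoch | Step | Training Loss | Validation Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|---|---|---|---|---|---|---|---|---|
| 0.3065 | 5 | 3.3947 | - | - | - | - | - | - |
| 0.6130 | 10 | 2.6401 | - | - | - | - | - | - |
| 0.9195 | 15 | 2.0152 | - | - | - | - | - | - |
| 0.9808 | 16 | - | 1.3404 | 0.1639 | 0.1577 | 0.1694 | 0.1503 | 0.1638 |
| 1.2261 | 20 | 1.4542 | - | - | - | - | - | - |
| 1.5326 | 25 | 1.0135 | - | - | - | - | - | - |
| 1.8391 | 30 | 0.8437 | - | - | - | - | - | - |
| 1.9617 | 32 | - | 0.9436 | 0.1556 | 0.1596 | 0.1600 | 0.1467 | 0.1701 |
| 2.1456 | 35 | 0.7676 | - | - | - | - | - | - |
| 2.4521 | 40 | 0.5126 | - | - | - | - | - | - |
| 2.7586 | 45 | 0.4358 | - | - | - | - | - | - |
| 2.9425 | 48 | - | 0.7852 | 0.1650 | 0.1693 | 0.1720 | 0.1511 | 0.1686 |
| 3.0651 | 50 | 0.4192 | - | - | - | - | - | - |
| 3.3716 | 55 | 0.3429 | - | - | - | - | - | - |
| 3.6782 | 60 | 0.3025 | - | - | - | - | - | - |
| 3.9847 | 65 | 0.2863 | 0.7401 | 0.1646 | 0.1706 | 0.1759 | 0.1480 | 0.1694 |
| 4.2912 | 70 | 0.2474 | - | - | - | - | - | - |
| 4.5977 | 75 | 0.2324 | - | - | - | - | - | - |
| 4.9042 | 80 | 0.2344 | - | - | - | - | - | - |
| 4.9655 | 81 | - | 0.7217 | 0.1663 | 0.1699 | 0.1767 | 0.1512 | 0.1696 |
| 5.2107 | 85 | 0.2181 | - | - | - | - | - | - |
| 5.5172 | 90 | 0.2116 | - | - | - | - | - | - |
| 5.8238 | 95 | 0.1926 | - | - | - | - | - | - |
| **5.8851** | **96** | **-** | **0.7154** | **0.166** | **0.1697** | **0.1716** | **0.1543** | **0.171** |

  • The bold row denotes the saved checkpoint.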
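Since load_best_model_at_end is True, the saved checkpoint is the best one seen during evaluation. The card does not list the selection metric, so the sketch below assumes the evaluation loss (the loss column of the log above, lower is better) and picks the winning row from the logged values.

```python
# (epoch, step, evaluation loss) for each evaluation row in the training log.
eval_rows = [
    (0.9808, 16, 1.3404),
    (1.9617, 32, 0.9436),
    (2.9425, 48, 0.7852),
    (3.9847, 65, 0.7401),
    (4.9655, 81, 0.7217),
    (5.8851, 96, 0.7154),
]

# Pick the row with the lowest evaluation loss.
best_epoch, best_step, best_loss = min(eval_rows, key=lambda row: row[2])
print(f"best checkpoint: step {best_step} (epoch {best_epoch}), loss {best_loss}")
```

Under that assumption the final evaluation at step 96 is selected, which agrees with the reported cosine_map@100 values matching the last row of the log.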

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.3
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}