metadata
datasets: []
language: []
library_name: sentence-transformers
metrics:
- cosine_accuracy
- dot_accuracy
- manhattan_accuracy
- euclidean_accuracy
- max_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:32621
- loss:TripletLoss
widget:
- source_sentence: >-
6.2 either party may terminate this agreement for cause if the other party
fails to perform any material provision of this agreement or commits a
material breach of this agreement which is not corrected within [***]
after receiving written notice of the failure or breach. except that if
the default is by 6 supplier that creates an immediate public food safety
risk, pnc may terminate this agreement immediately without regard to any
period for correction.
sentences:
- what constitutes a material violation under the default provision?
- >-
8.3 termination for cause. this agreement may be terminated by a party
---------------------- for cause immediately upon the occurrence of and
in accordance with the following: (a) insolvency event. either may
terminate this agreement by delivering written notice to the other party
upon the occurrence of any of the following events: (i) a receiver is
appointed for either party or its property; (ii) either makes a general
assignment for the benefit of its creditors; (iii) either party
commences, or has commenced against it, proceedings under any
bankruptcy, insolvency or debtor's relief law, which proceedings are not
dismissed within sixty (60) days; or (iv) either party is liquidated or
source: rae systems inc, 10-q, 11/14/2000 dissolved. (b) change of
control. in the event more that there is a change in ownership
representing fifty percent (50%) or more of the equity ownership of
either party, the other party may, at its option, terminate this
agreement upon written notice. (c) default. either party may terminate
this agreement effective upon written notice to the other if the other
party violates any covenant, agreement, representation or warranty
contained herein in any material respect or defaults or fails to perform
any of its obligations or agreements hereunder in any material respect,
which violation, default or failure is not cured within thirty (30) days
after notice thereof from the non-defaulting party stating its intention
to terminate this agreement by reason thereof.
- does chinese law supersede international regulations?
- source_sentence: >-
(a) member specifically acknowledges that, pursuant to the franchise
agreement, and by virtue of its position with franchisee, member will
receive valuable specialized training and confidential information,
including, without limitation, information regarding the operational,
sales, promotional, and marketing methods and techniques of franchisor and
the system.
sentences:
- >-
1. confidential information. member shall not, during the term of the
franchise agreement or thereafter, communicate, divulge or use, for any
purpose other than the operation of the franchised business, any
confidential information, knowledge, trade secrets or know-how which may
be communicated to member or which member may learn by virtue of
member's relationship with franchisee. all information, knowledge and
know-how relating to franchisor, its business plans, franchised
businesses, or the system ("confidential information") is deemed
confidential, except for information that member can demonstrate came to
member's attention by lawful means prior to disclosure to member; or
which, at the time of the disclosure to member, had become a part of the
public domain.
- >-
can the member use trade secrets for purposes outside of operating the
franchised business?
- is written consent from party a mandatory for party b's assignment?
- source_sentence: >-
ad networks we may feature advertising within our service. the advertisers
may collect and use information about you, such as your service session
activity, device identifier, mac address, imei, geo-location information
and ip address. they may use this information to provide advertisements of
interest to you. please refer to our list of partners within the services
and for more information on how to opt out at:
http://www.supercell.net/partner-opt-out.
sentences:
- >-
what is the designation for the type of data that pertains to a person's
confidential and unique identifiers, including their electronic mail
details and connections within online platforms?
- which entities constitute 'ad partners' as mentioned in the clause?
- >-
how we use data collection tools and online advertising under armour
uses cookies and other data collection tools like web beacons to collect
data that help us personalize your use of our websites and mobile
applications. we also work with a variety of advertisers, advertising
networks, advertising servers, and analytics companies ("ad partners")
that use various technologies including cookies to collect data about
your use of the services (such as pages visited, ads viewed or clicked
on) so that we and our ad partners deliver ads to you based on your
interests and online activities.
- source_sentence: >-
third-party vendors, including google, use cookies to serve ads based on a
user's prior visits to our website and other websites. google's use of
advertising cookies enables it and its partners to serve ads based on
visits to our site or other sites on the internet. you can opt out of
personalized advertising by visiting google's ads settings. alternately,
you can opt out of other third-party vendors' uses of cookies by visiting
the digital advertising alliance's (daa) opt out page at
http://www.aboutads.info/choices or http://www.aboutads.info/appchoices.
to find out more about how google uses data it collects please visit
google privacy & principals.
sentences:
- >-
google, as a third party vendor, uses cookies to serve ads on our site.
google's use of the dart cookie enables it to serve ads to our users
based on their visit to our site and other sites on the internet. users
may opt out of the use of the dart cookie by visiting the google ad and
content network privacy policy.
- is google considered a third-party vendor in this context?
- >-
what are the obligations of the henry film and entertainment corporation
under this agreement?
- source_sentence: >-
sponsor acknowledges and agrees that, notwithstanding the grant of
exclusivity set forth in this section 4, team shall have the right to
solicit and enter into sponsorships with other parties that are not known
primarily or exclusively as suppliers or providers of any product or
service within the product and services category.
sentences:
- what constitutes a 'purchase' under the revenue-sharing agreement?
- >-
for the avoidance of doubt, the parties acknowledge that the foregoing
restriction applies only to persistent sponsorship placement as judged
by sponsor at its discretion, and not to run-of-site banner
advertisements or other rotating promotional placements.
- >-
what does 'foregoing restriction' refer to specifically within the
context of sponsorships?
model-index:
- name: SentenceTransformer
results:
- task:
type: triplet
name: Triplet
dataset:
name: all nli dev
type: all-nli-dev
metrics:
- type: cosine_accuracy
value: 0.5286745157614888
name: Cosine Accuracy
- type: dot_accuracy
value: 0.47322445879225217
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.5104443600455754
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.5142423091530574
name: Euclidean Accuracy
- type: max_accuracy
value: 0.5286745157614888
name: Max Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: all nli test
type: all-nli-test
metrics:
- type: cosine_accuracy
value: 0.529054310672237
name: Cosine Accuracy
- type: dot_accuracy
value: 0.470945689327763
name: Dot Accuracy
- type: manhattan_accuracy
value: 0.5100645651348272
name: Manhattan Accuracy
- type: euclidean_accuracy
value: 0.515381693885302
name: Euclidean Accuracy
- type: max_accuracy
value: 0.529054310672237
name: Max Accuracy
SentenceTransformer
This is a sentence-transformers model trained with TripletLoss on a dataset of 32,621 triplets. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
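As a rough illustration of the similarity function named above, the following sketch computes cosine similarity by hand. It uses only the standard library, and the toy 3-dimensional vectors are made up for the example; real embeddings from this model have 384 dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors standing in for 384-dimensional embeddings (hypothetical values).
query = [1.0, 0.0, 1.0]
close = [2.0, 0.0, 2.0]   # same direction as the query, different length
far = [0.0, 1.0, 0.0]     # orthogonal to the query

print(cosine_similarity(query, close))  # close to 1: same direction
print(cosine_similarity(query, far))    # 0.0: unrelated direction
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.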
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
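The Pooling module above has `pooling_mode_mean_tokens` set to True, so a sentence embedding is the average of its token embeddings. The sketch below shows that idea for a single sentence in plain Python. The function name `mean_pool` and the toy values are illustrative assumptions, and the real module works on batched tensors rather than lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [value / max(count, 1) for value in total]

# Three token vectors of width 2; the last one is padding (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```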
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("kperkins411/multi-qa-MiniLM-L6-cos-v1_triplet")
# Run inference
sentences = [
'sponsor acknowledges and agrees that, notwithstanding the grant of exclusivity set forth in this section 4, team shall have the right to solicit and enter into sponsorships with other parties that are not known primarily or exclusively as suppliers or providers of any product or service within the product and services category.',
"what does 'foregoing restriction' refer to specifically within the context of sponsorships?",
'for the avoidance of doubt, the parties acknowledge that the foregoing restriction applies only to persistent sponsorship placement as judged by sponsor at its discretion, and not to run-of-site banner advertisements or other rotating promotional placements.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Triplet
- Dataset: all-nli-dev
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.5287 |
dot_accuracy | 0.4732 |
manhattan_accuracy | 0.5104 |
euclidean_accuracy | 0.5142 |
max_accuracy | 0.5287 |
Triplet
- Dataset: all-nli-test
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.5291 |
dot_accuracy | 0.4709 |
manhattan_accuracy | 0.5101 |
euclidean_accuracy | 0.5154 |
max_accuracy | 0.5291 |
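To make the cosine_accuracy rows easier to read, here is a minimal sketch of how a triplet accuracy of this kind can be computed: a triplet counts as correct when the anchor is more similar to the positive than to the negative. This is a simplified stand-in and not the TripletEvaluator implementation; the function names and toy vectors are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cosine_triplet_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)

# Two toy triplets with 2-dimensional embeddings (hypothetical values).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: correct
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),  # negative is closer: wrong
]
accuracy = cosine_triplet_accuracy(toy_triplets)
print(accuracy)  # 0.5
```

On this kind of metric, a model that ranks the positive and negative at random would score about 0.5, which is a useful baseline when reading the tables above.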
Training Details
Training Dataset
Unnamed Dataset
- Size: 32,621 training samples
- Columns: negative, anchor, and positive
- Approximate statistics based on the first 1000 samples:
Statistic | negative | anchor | positive |
---|---|---|---|
type | string | string | string |
min | 6 tokens | 5 tokens | 6 tokens |
mean | 80.74 tokens | 17.19 tokens | 101.64 tokens |
max | 512 tokens | 167 tokens | 512 tokens |
- Samples:
negative | anchor | positive |
---|---|---|
c. the obligations specified in this article shall not apply to information for which the receiving party can reasonably demonstrate that such information: iii. becomes known to the receiving party through disclosure by sources other than the disclosing party, having a right to disclose such information, | what safeguards are in place to protect the information obtained from third-party sources? | information we collect from other sources we may also receive information from other sources and combine that with information we collect through our services. for example: if you choose to link, create, or log in to your uber account with a payment provider (e.g., google wallet) or social media service (e.g., facebook), or if you engage with a separate app or website that uses our api (or whose api we use), we may receive information about you or your connections from that site or app. |
3.2 manufacturing standards the manufacturer covenants that it is and will remain for the term of this agreement in compliance with all international standards in production and manufacturing. | is there a guarantee from the manufacturers regarding the conformity of the items to the mutually approved written standards for a certain duration? | each of the suppliers warrants that the products shall comply with the specifications and documentation agreed by the relevant supplier and the company in writing that is applicable to such products for the warranty period. |
planetcad hereby grants to dassault systemes a fully-paid, non-exclusive, worldwide, revocable limited license to the server software and infrastructure for the sole purpose of (i) hosting the co-branded service and (ii) fulfilling itsobligations under this agreement. | what type of authorization has the video conferencing service provided to the british virgin islands-based entity and its associated organization regarding their intellectual property, with respect to the customized software and web platform, including the conditions for customer access to enhanced functionalities that incur additional charges? | skype hereby grants to online bvi and the company a limited, non-exclusive, non-sublicensable (except as set forth herein), non-transferable, non-assignable (except as provided in section 14.4), royalty-free (but subject to the provisions of section 5), license during the term to use, market, provide access to, promote, reproduce and display the skype intellectual property solely (i) as incorporated in the company-skype branded application and/or the company-skype toolbar, and (ii) as incorporated in, for the development of, and for transmission pursuant to this agreement of, the company-skype branded content and the company-skype branded web site, in each case for the sole purposes (unless otherwise mutually agreed by the parties) of promoting and distributing, pursuant to this agreement, the company-skype branded application, the company-skype toolbar, the company-skype branded content and the company-skype branded web site in the territory; (a) provided, that it is understood that the company-skype branded customers will have the right under the eula to use the company- skype branded application and the company-skype toolbar and will have the right to access the company-skype branded content, the company-skype branded web site and the online bvi web site through the internet and to otherwise receive support from the company anywhere in the world, and that the company shall be permitted to provide access to and reproduce and display the skype intellectual property through the internet anywhere in the world, and (b) provided further, that online bvi and the company shall ensure that no company-skype branded customer (or potential company-skype branded customer) shall be permitted to access, using the company-skype branded application or the company-skype toolbar or through the company-skype branded web site, any skype premium features requiring payment by the company-skype branded customer (or potential company-skype branded customer), including, but not limited to, skypein, skypeout, or skype plus, unless such company-skype branded customer (or potential company-skype branded customer) uses the payment methods made available by the company pursuant to section 2.5 for the purchase of such premium features. |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
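With a Euclidean distance metric and a margin of 5, the triplet loss for one triplet is max(d(anchor, positive) - d(anchor, negative) + 5, 0). The sketch below evaluates that expression for a single triplet in plain Python. The helper names and toy points are illustrative, and the library version operates on batches of embeddings and averages the result.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(a, p) - d(a, n) + margin, 0) with Euclidean distance d."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy 2-dimensional points (hypothetical values).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # distance 5 from the anchor
negative = [6.0, 8.0]   # distance 10 from the anchor

print(triplet_loss(anchor, positive, negative))  # 5 - 10 + 5 = 0.0
print(triplet_loss(anchor, negative, positive))  # 10 - 5 + 5 = 10.0
```

The loss reaches zero only once the negative is at least `margin` farther from the anchor than the positive, so a margin of 5 asks for a fairly wide gap in embedding space.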
Evaluation Dataset
Unnamed Dataset
- Size: 2,641 evaluation samples
- Columns: negative, anchor, and positive
- Approximate statistics based on the first 1000 samples:
Statistic | negative | anchor | positive |
---|---|---|---|
type | string | string | string |
min | 6 tokens | 6 tokens | 6 tokens |
mean | 83.63 tokens | 18.61 tokens | 98.17 tokens |
max | 512 tokens | 512 tokens | 512 tokens |
- Samples:
negative | anchor | positive |
---|---|---|
this agreement shall be governed by, and construed in accordance with the law of the state of new york. | are there any exceptions to the governing law stated? | this agreement shall be governed by the laws of the state of california, without regard to the conflicts of law provisions of any jurisdiction. |
you consent to the third party use, sharing and transfer of your personal information (both inside and outside of your jurisdiction) as described in this section. these third parties will use personal information to provide services to us and for their own internal use, including analytics use. we allow third parties such as analytics providers and advertising partners to collect your personal information over time and across different websites or online services when you use our services. | collection of personal data legal basis? | 15. notice for malaysia residents close in view of the implementation of the personal data protection act 2010 ("act"), sony mobile recognises the need to process all personal data obtained in a lawful and appropriate manner. the legal responsibility for compliance with the act lies with sony mobile, which is the "data user" under the act. compliance with this privacy policy and the act is the responsibility of all employees of sony mobile. as and when sony mobile is required to collect personal data, sony mobile and its employees must abide by the requirements of this privacy policy and the act. in the context of the act, "processing" is defined as including the collection, recording, holding or storing of personal data which includes, inter alia, nric numbers, home address and contact details. |
you can prevent peel from showing you targeted ads by sending an email to [email protected] and asking to opt-out of targeted advertising. opting-out will only prevent targeted ads from being displayed so you may continue to see generic (non-targeted) ads from peel after you opt-out. for more information on interest-based ads or to stop use of tracking technologies for these purposes, go to www.aboutads.info or www.networkadvertising.org. | how does one opt out from third-party analytics providers? | when you use our services, we collect the following information: information about your device (including device model, os version and operator's name), time and date of the connection to the game and/or service, ip or mac address, international mobile equipment id (imei), android id, device mac address, cookie information. we also from time-to-time use services provided by third party companies that might collect information from you, and you can opt-out from this. follow the directions provided by our other third party analytics provider located at http://www.flurry.com/user-opt-out.html, https://help.chartboost.com/legal/privacy, http://privacy.adcolony.com/, http://info.tapjoy.com/about-tapjoy/privacy-policy/, http://sponsorpay.com/. if you "opt out" with our third party analytics providers, that action is specific to the information we collect specifically for that provider, and does not limit our ability to collect information from you, under the terms of this privacy policy, for other third parties. |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 4
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
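These settings imply a step count and a learning-rate curve that can be worked out by hand. The sketch below does the arithmetic under two assumptions: every epoch uses all 32,621 samples in batches of 64 (the no_duplicates sampler may change the exact batch count), and the warmup length is the ceiling of warmup_ratio times the total steps, followed by linear decay to zero as the linear scheduler type suggests. The resulting total of 2,040 steps agrees with the final step in the training logs.

```python
import math

# Values taken from the model card.
train_samples = 32621
batch_size = 64
epochs = 4
warmup_ratio = 0.1
base_lr = 2e-05

steps_per_epoch = math.ceil(train_samples / batch_size)  # 510
total_steps = steps_per_epoch * epochs                    # 2040
warmup_steps = math.ceil(total_steps * warmup_ratio)      # 204

def linear_schedule(step):
    """Linear warmup to base_lr, then linear decay to zero (a simplified sketch)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

for step in (0, 102, warmup_steps, 1122, total_steps):
    print(step, linear_schedule(step))
```

Under these assumptions the learning rate peaks at 2e-05 at step 204 and falls back to zero at step 2,040.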
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | all-nli-dev_max_accuracy | all-nli-test_max_accuracy |
---|---|---|---|---|---|
0 | 0 | - | - | 0.7235 | - |
0.1961 | 100 | 4.9029 | 3.1938 | 0.6058 | - |
0.3922 | 200 | 2.4204 | 1.5424 | 0.5507 | - |
0.5882 | 300 | 1.6076 | 1.0643 | 0.5344 | - |
0.7843 | 400 | 1.3142 | 0.8831 | 0.5351 | - |
0.9804 | 500 | 1.1919 | 0.7455 | 0.5435 | - |
1.1745 | 600 | 1.0824 | 0.6599 | 0.5427 | - |
1.3706 | 700 | 0.963 | 0.6360 | 0.5518 | - |
1.5667 | 800 | 0.8922 | 0.6131 | 0.5397 | - |
1.7627 | 900 | 0.8417 | 0.5900 | 0.5302 | - |
1.9588 | 1000 | 0.8165 | 0.5662 | 0.5253 | - |
2.1529 | 1100 | 0.7774 | 0.5192 | 0.5177 | - |
2.3490 | 1200 | 0.7394 | 0.5158 | 0.5363 | - |
2.5451 | 1300 | 0.7003 | 0.5185 | 0.5363 | - |
2.7412 | 1400 | 0.6636 | 0.5004 | 0.5310 | - |
2.9373 | 1500 | 0.6586 | 0.4872 | 0.5302 | - |
3.1314 | 1600 | 0.6831 | 0.4687 | 0.5306 | - |
3.3275 | 1700 | 0.6494 | 0.4667 | 0.5268 | - |
3.5235 | 1800 | 0.624 | 0.4750 | 0.5321 | - |
3.7196 | 1900 | 0.6035 | 0.4735 | 0.5264 | - |
3.9157 | 2000 | 0.6136 | 0.4679 | 0.5287 | - |
3.9941 | 2040 | - | - | - | 0.5291 |
Framework Versions
- Python: 3.11.9
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}