SentenceTransformer based on Qwen/Qwen2-1.5B-instruct
This is a sentence-transformers model finetuned from Qwen/Qwen2-1.5B-instruct. It maps sentences & paragraphs to a 1536-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Qwen/Qwen2-1.5B-instruct
- Maximum Sequence Length: 32768 tokens
- Output Dimensionality: 1536 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 32768, 'do_lower_case': False}) with Transformer model: Qwen2Model
(1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
(2): Normalize()
)
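The last two modules matter in practice: last-token pooling condenses the whole token sequence into a single 1536-dimensional vector, and Normalize() scales it to unit length, so dot product and cosine similarity coincide. A minimal sanity check (assuming the library from the Usage section below is installed):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("asbabiy/crm-mail-embedder-cosent")

# One string in -> one 1536-dimensional vector out
embedding = model.encode("A short test sentence.")
print(embedding.shape)  # (1536,)

# Normalize() makes embeddings unit-length, so dot product == cosine similarity
print(np.linalg.norm(embedding))  # ~1.0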
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("asbabiy/crm-mail-embedder-cosent")
# Run inference
sentences = [
'Mail Queue: ratehawk-b2b\nMail From: [email protected]\nMail To: [email protected]\n\nMail Subject: Ticket Closed - URGENT : Reconfirmation & HCN for ATS160057 : 139201464/Check-in date - 12 Mar 2024\n\nMail Body:\n"""\nDear Support, Your ticket - URGENT : Reconfirmation & HCN for ATS160057 : 139201464/Check-in date - 12 Mar 2024 - has been closed. We hope that the ticket was resolved to your satisfaction. If you feel that the ticket should not be closed or if the ticket has not been resolved, please reply to this email. Sincerely, Travelclub Support Team https://blue7tech-help.freshdesk.com/helpdesk/tickets/63824\n"""',
"Email category: 'TPP -- Auto template'. Email category description: 'This is an automated email from the supplier acknowledging receipt of a previous communication or providing a status update on a pending request without any specific update on the request. It solely includes a phrase indicating that the request has been acknowledged. Such emails may contain messages such as: information that the request has been taken or in process; that the ticket for the request has been created; that it is a holiday and the office hours have changed; that the company's working hours have been adjusted; that a number has been assigned to the request and updates will be provided once available; that the information has been received and transffered to the guest or hotel; or that they will contact us shortly. Also this can be message from any of our supplier stating that our account recently attempted to log in from New Browser. The purpose of this email is to let you know that your message has been received and is being handled.Email lacks personalized details specific to the recipient's situation or references to a unique order or request, which may indicate it is a generic automated response. Auto-emails are often rich with html formatting, tabular data and have a lot of tags or links.'",
"Email category: 'TPP -- Additional request of arrival time'. Email category description: 'A request from the supplier asking for the client to provide the exact or approximate check-in/arrival time as this is requested by the hotel due to different reasons. For example, the hotel does not have 24 hour reception and for this reason is asking for the arrival time. Information about the check-in helps the hotel better prepare for the guest's arrival and plan the schedule of the hotel staff.'",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1536]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
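Since the model was trained to align emails with category descriptions, the same embeddings can route an incoming email to its best-matching category. A minimal sketch reusing the sentences list above, where index 0 is the email and indices 1-2 are candidate category descriptions:

# Embed the email and the candidate category descriptions separately
email_embedding = model.encode(sentences[0])
category_embeddings = model.encode(sentences[1:])

# Cosine similarity of the email against each category; pick the highest
scores = model.similarity(email_embedding, category_embeddings)
best = int(scores.argmax())
print(best, float(scores[0, best]))  # e.g. 0 -> 'TPP -- Auto template'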
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- gradient_accumulation_steps: 16
- learning_rate: 1e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- load_best_model_at_end: True
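For context, these values plug directly into the Sentence Transformers v3 trainer API. The sketch below is a minimal, hypothetical reconstruction of the training setup: the train_dataset and eval_dataset are placeholders, and CoSENTLoss (cited at the end of this card) expects sentence pairs with float similarity scores.

from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, models
from sentence_transformers.losses import CoSENTLoss
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

# Rebuild the architecture shown above: Transformer -> last-token pooling -> normalize
word = models.Transformer("Qwen/Qwen2-1.5B-instruct", max_seq_length=32768)
pool = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="lasttoken")
model = SentenceTransformer(modules=[word, pool, models.Normalize()])

args = SentenceTransformerTrainingArguments(
    output_dir="crm-mail-embedder-cosent",
    eval_strategy="steps",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,  # effective batch size of 64 per device
    learning_rate=1e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: rows of (sentence1, sentence2, score)
    eval_dataset=eval_dataset,    # placeholder
    loss=CoSENTLoss(model),
)
trainer.train()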
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0031 | 5 | 1.8139 | - |
0.0062 | 10 | 1.699 | - |
0.0093 | 15 | 1.6467 | - |
0.0124 | 20 | 1.7853 | - |
0.0155 | 25 | 1.7918 | - |
0.0186 | 30 | 1.9042 | - |
0.0217 | 35 | 1.7087 | - |
0.0248 | 40 | 1.7143 | - |
0.0279 | 45 | 1.7357 | - |
0.0310 | 50 | 1.5956 | 1.6129 |
0.0341 | 55 | 1.7191 | - |
0.0372 | 60 | 1.5434 | - |
0.0403 | 65 | 1.6527 | - |
0.0434 | 70 | 1.6267 | - |
0.0465 | 75 | 1.5512 | - |
0.0497 | 80 | 1.4611 | - |
0.0528 | 85 | 1.49 | - |
0.0559 | 90 | 1.4336 | - |
0.0590 | 95 | 1.3646 | - |
0.0621 | 100 | 1.5523 | 1.4122 |
0.0652 | 105 | 1.4359 | - |
0.0683 | 110 | 1.4459 | - |
0.0714 | 115 | 1.4872 | - |
0.0745 | 120 | 1.3775 | - |
0.0776 | 125 | 1.3807 | - |
0.0807 | 130 | 1.3692 | - |
0.0838 | 135 | 1.3156 | - |
0.0869 | 140 | 1.328 | - |
0.0900 | 145 | 1.5123 | - |
0.0931 | 150 | 1.4037 | 1.3554 |
0.0962 | 155 | 1.4797 | - |
0.0993 | 160 | 1.4434 | - |
0.1024 | 165 | 1.3876 | - |
0.1055 | 170 | 1.3611 | - |
0.1086 | 175 | 1.3986 | - |
0.1117 | 180 | 1.3135 | - |
0.1148 | 185 | 1.3268 | - |
0.1179 | 190 | 1.2853 | - |
0.1210 | 195 | 1.3606 | - |
0.1241 | 200 | 1.4254 | 1.3225 |
0.1272 | 205 | 1.3152 | - |
0.1303 | 210 | 1.3482 | - |
0.1334 | 215 | 1.347 | - |
0.1365 | 220 | 1.3722 | - |
0.1396 | 225 | 1.3877 | - |
0.1428 | 230 | 1.3635 | - |
0.1459 | 235 | 1.4738 | - |
0.1490 | 240 | 1.4063 | - |
0.1521 | 245 | 1.3481 | - |
0.1552 | 250 | 1.3221 | 1.2848 |
0.1583 | 255 | 1.1117 | - |
0.1614 | 260 | 1.33 | - |
0.1645 | 265 | 1.3461 | - |
0.1676 | 270 | 1.2067 | - |
0.1707 | 275 | 1.3238 | - |
0.1738 | 280 | 1.4214 | - |
0.1769 | 285 | 1.3172 | - |
0.1800 | 290 | 1.2829 | - |
0.1831 | 295 | 1.3561 | - |
0.1862 | 300 | 1.2153 | 1.2869 |
0.1893 | 305 | 1.3482 | - |
0.1924 | 310 | 1.4491 | - |
0.1955 | 315 | 1.296 | - |
0.1986 | 320 | 1.5481 | - |
0.2017 | 325 | 1.3483 | - |
0.2048 | 330 | 1.2984 | - |
0.2079 | 335 | 1.2619 | - |
0.2110 | 340 | 1.2424 | - |
0.2141 | 345 | 1.3138 | - |
0.2172 | 350 | 1.4771 | 1.2831 |
0.2203 | 355 | 1.4589 | - |
0.2234 | 360 | 1.2647 | - |
0.2265 | 365 | 1.3268 | - |
0.2296 | 370 | 1.2185 | - |
0.2327 | 375 | 1.2264 | - |
0.2359 | 380 | 1.4256 | - |
0.2390 | 385 | 1.5409 | - |
0.2421 | 390 | 1.3106 | - |
0.2452 | 395 | 1.3129 | - |
0.2483 | 400 | 1.4063 | 1.2688 |
0.2514 | 405 | 1.1013 | - |
0.2545 | 410 | 1.3415 | - |
0.2576 | 415 | 1.4586 | - |
0.2607 | 420 | 1.2412 | - |
0.2638 | 425 | 1.3019 | - |
0.2669 | 430 | 1.2388 | - |
0.2700 | 435 | 1.3902 | - |
0.2731 | 440 | 1.3822 | - |
0.2762 | 445 | 1.2138 | - |
0.2793 | 450 | 1.4039 | 1.2490 |
0.2824 | 455 | 1.1758 | - |
0.2855 | 460 | 1.306 | - |
0.2886 | 465 | 1.4698 | - |
0.2917 | 470 | 1.2116 | - |
0.2948 | 475 | 1.2531 | - |
0.2979 | 480 | 1.3357 | - |
0.3010 | 485 | 1.1919 | - |
0.3041 | 490 | 1.3818 | - |
0.3072 | 495 | 1.2979 | - |
0.3103 | 500 | 1.2832 | 1.2466 |
0.3134 | 505 | 1.1689 | - |
0.3165 | 510 | 1.2198 | - |
0.3196 | 515 | 1.2775 | - |
0.3227 | 520 | 1.1344 | - |
0.3258 | 525 | 1.4492 | - |
0.3289 | 530 | 1.2328 | - |
0.3321 | 535 | 1.3306 | - |
0.3352 | 540 | 1.1076 | - |
0.3383 | 545 | 1.285 | - |
0.3414 | 550 | 1.2523 | 1.2435 |
0.3445 | 555 | 1.1712 | - |
0.3476 | 560 | 1.4021 | - |
0.3507 | 565 | 1.3476 | - |
0.3538 | 570 | 1.1485 | - |
0.3569 | 575 | 1.2621 | - |
0.3600 | 580 | 1.2829 | - |
0.3631 | 585 | 1.274 | - |
0.3662 | 590 | 1.2649 | - |
0.3693 | 595 | 1.2262 | - |
0.3724 | 600 | 1.1743 | 1.2378 |
0.3755 | 605 | 1.1773 | - |
0.3786 | 610 | 1.1977 | - |
0.3817 | 615 | 1.3976 | - |
0.3848 | 620 | 1.1817 | - |
0.3879 | 625 | 1.1928 | - |
0.3910 | 630 | 1.2338 | - |
0.3941 | 635 | 1.1803 | - |
0.3972 | 640 | 1.3811 | - |
0.4003 | 645 | 1.3125 | - |
0.4034 | 650 | 1.1878 | 1.2311 |
0.4065 | 655 | 1.4805 | - |
0.4096 | 660 | 1.1262 | - |
0.4127 | 665 | 1.1919 | - |
0.4158 | 670 | 1.2076 | - |
0.4189 | 675 | 1.2401 | - |
0.4220 | 680 | 1.3019 | - |
0.4252 | 685 | 1.3285 | - |
0.4283 | 690 | 1.1257 | - |
0.4314 | 695 | 1.2628 | - |
0.4345 | 700 | 1.1846 | 1.2354 |
0.4376 | 705 | 1.0939 | - |
0.4407 | 710 | 1.2502 | - |
0.4438 | 715 | 1.3645 | - |
0.4469 | 720 | 1.2408 | - |
0.4500 | 725 | 1.3127 | - |
0.4531 | 730 | 1.2795 | - |
0.4562 | 735 | 1.3127 | - |
0.4593 | 740 | 1.2164 | - |
0.4624 | 745 | 1.2942 | - |
0.4655 | 750 | 1.1968 | 1.2342 |
0.4686 | 755 | 1.2426 | - |
0.4717 | 760 | 1.2269 | - |
0.4748 | 765 | 1.3602 | - |
0.4779 | 770 | 1.2335 | - |
0.4810 | 775 | 1.3015 | - |
0.4841 | 780 | 1.1144 | - |
0.4872 | 785 | 1.3083 | - |
0.4903 | 790 | 1.273 | - |
0.4934 | 795 | 1.1784 | - |
0.4965 | 800 | 1.204 | 1.2348 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.2.0+cu121
- Accelerate: 0.33.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1
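To match this environment, the Python-package versions above can be pinned at install time (a minimal example; the CUDA 12.1 build of PyTorch 2.2.0 may additionally require the appropriate PyTorch index URL):

pip install sentence-transformers==3.0.1 transformers==4.44.0 accelerate==0.33.0 datasets==2.20.0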
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CoSENTLoss
@online{kexuefm-8847,
title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
author={Su Jianlin},
year={2022},
month={Jan},
url={https://kexue.fm/archives/8847},
}