DunnBC22/sentence-t5-base-FT-Quora_Sentence_Similarity-LG

This is a sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering or semantic search.

For more information on how it was created, check out the following link: https://github.com/DunnBC22/NLP_Projects/blob/main/Semantic_Similarity/Semantic%20Similarity-base.ipynb

Usage (Sentence-Transformers)

Using this model is straightforward once sentence-transformers is installed:

pip install -U sentence-transformers

Then you can use the model like this:

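from sentence_transformers import SentenceTransformer

# Sentences to embed
sentences = ["This is an example sentence", "Each sentence is converted"]

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("DunnBC22/sentence-t5-base-FT-Quora_Sentence_Similarity-LG")

# encode() returns one 768-dimensional vector per input sentence
embeddings = model.encode(sentences)
print(embeddings)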

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

| Metric | Measure | Value | Notes |
|---|---|---|---|
| Accuracy | Cosine-Similarity | 85.93 | Threshold: 0.8320 |
| F1 | Cosine-Similarity | 82.89 | Threshold: 0.8178 |
| Precision | Cosine-Similarity | 77.43 | - |
| Recall | Cosine-Similarity | 89.18 | - |
| Average Precision | Cosine-Similarity | 87.13 | - |
| Accuracy | Manhattan-Distance | 85.95 | Threshold: 12.7721 |
| F1 | Manhattan-Distance | 82.89 | Threshold: 13.5008 |
| Precision | Manhattan-Distance | 76.91 | - |
| Recall | Manhattan-Distance | 89.89 | - |
| Average Precision | Manhattan-Distance | 87.13 | - |
| Accuracy | Euclidean-Distance | 85.93 | Threshold: 0.5797 |
| F1 | Euclidean-Distance | 82.89 | Threshold: 0.6037 |
| Precision | Euclidean-Distance | 77.43 | - |
| Recall | Euclidean-Distance | 89.18 | - |
| Average Precision | Euclidean-Distance | 87.13 | - |
| Accuracy | Dot-Product | 85.93 | Threshold: 0.8320 |
| F1 | Dot-Product | 82.89 | Threshold: 0.8178 |
| Precision | Dot-Product | 77.43 | - |
| Recall | Dot-Product | 89.18 | - |
| Average Precision | Dot-Product | 87.14 | - |
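
Because the model ends with a Normalize() layer, its embeddings have unit length. For unit vectors the dot product equals the cosine similarity, and the Euclidean distance is d = sqrt(2 - 2 * cos), which is consistent with the matching Cosine-Similarity and Dot-Product rows and with the reported Euclidean thresholds. The sketch below illustrates that relationship and how a threshold turns similarity scores into accuracy, precision, recall, and F1. The helper names and the toy scores are illustrative assumptions, not the evaluator's own code or data.

```python
import math

def euclidean_from_cosine(cos_sim: float) -> float:
    """Euclidean distance between two unit vectors with the given cosine similarity."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cos_sim))

def binary_metrics(scores, labels, threshold):
    """Accuracy, precision, recall, F1 when score >= threshold predicts 'duplicate' (label 1)."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(preds, labels))
    fp = sum(p and y == 0 for p, y in zip(preds, labels))
    fn = sum((not p) and y == 1 for p, y in zip(preds, labels))
    tn = sum((not p) and y == 0 for p, y in zip(preds, labels))
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Cosine thresholds taken from the table, converted to Euclidean distance.
acc_threshold_euclid = euclidean_from_cosine(0.8320)  # close to the reported 0.5797
f1_threshold_euclid = euclidean_from_cosine(0.8178)   # close to the reported 0.6037

# Toy scores and labels (made up for illustration only).
toy_scores = [0.95, 0.85, 0.80, 0.60, 0.90]
toy_labels = [1, 1, 0, 0, 0]
toy_metrics = binary_metrics(toy_scores, toy_labels, threshold=0.8320)
print(acc_threshold_euclid, f1_threshold_euclid, toy_metrics)
```

The Manhattan-Distance rows do not follow from the cosine similarity in the same closed form, which is why their values differ slightly.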

Training

The model was trained with the parameters:

DataLoader:

torch.utils.data.dataloader.DataLoader of length 4673 with parameters:

{'batch_size': 64, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}

Loss:

sentence_transformers.losses.OnlineContrastiveLoss.OnlineContrastiveLoss
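
As a rough illustration of the idea behind an online contrastive loss, the sketch below keeps only the "hard" pairs in a batch: similar pairs that are farther apart than the closest dissimilar pair, and dissimilar pairs that are closer than the farthest similar pair. It is a simplified plain-Python sketch, not the library implementation; the function name, the margin of 0.5, and the toy distances are assumptions made for illustration.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Simplified contrastive loss over hard pairs only.

    distances: one distance per sentence pair (e.g. cosine distance).
    labels: 1 for a similar (duplicate) pair, 0 for a dissimilar pair.
    margin: assumed value; dissimilar pairs farther apart than this cost nothing.
    """
    pos = [d for d, y in zip(distances, labels) if y == 1]
    neg = [d for d, y in zip(distances, labels) if y == 0]

    # Hard negatives: dissimilar pairs closer than the farthest similar pair.
    hard_neg = [d for d in neg if d < max(pos)] if pos else neg
    # Hard positives: similar pairs farther apart than the closest dissimilar pair.
    hard_pos = [d for d in pos if d > min(neg)] if neg else pos

    # Pull hard positives together, push hard negatives beyond the margin.
    positive_loss = sum(d ** 2 for d in hard_pos)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return positive_loss + negative_loss

# Toy batch (made-up distances): two similar pairs, two dissimilar pairs.
toy_distances = [0.1, 0.6, 0.3, 0.9]
toy_labels = [1, 1, 0, 0]
toy_loss = online_contrastive_loss(toy_distances, toy_labels)
print(toy_loss)
```

In this toy batch only the similar pair at distance 0.6 and the dissimilar pair at distance 0.3 contribute; the easy pairs at 0.1 and 0.9 are ignored.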

Parameters of the fit()-Method:

{
    "epochs": 1,
    "evaluation_steps": 0,
    "evaluator": "sentence_transformers.evaluation.BinaryClassificationEvaluator.BinaryClassificationEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 2,
    "weight_decay": 0.01
}
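
To make the "WarmupLinear" setting concrete, the following sketch computes a learning rate that rises linearly from zero over the warmup steps and then decays linearly to zero by the end of training. It assumes the total number of steps equals the DataLoader length (4673) times the number of epochs (1), since steps_per_epoch is null; the function name is hypothetical and the code is an approximation of the schedule's shape, not the library's scheduler.

```python
def warmup_linear_lr(step, base_lr=2e-05, warmup_steps=2, total_steps=4673):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at a few points in training.
for step in (0, 1, 2, 2337, 4673):
    print(step, warmup_linear_lr(step))
```

With only 2 warmup steps, the schedule is effectively a linear decay from 2e-05 over the whole epoch.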

Potential Improvements

One way to improve the results is to fine-tune a larger T5 checkpoint. This model was trained from the T5-base checkpoint.

The T5 checkpoints and their approximate parameter counts are:

| Checkpoint | # of Train Params |
|---|---|
| T5-Base | 220 Million* |
| T5-Large | 770 Million |
| T5-3B | 3 Billion |
| T5-11B | 11 Billion |

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 34, 'do_lower_case': False}) with Transformer model: T5EncoderModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Normalize()
)
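
The Pooling and Normalize stages above can be sketched in plain Python: token vectors are averaged (mean pooling, ignoring padding) and the result is scaled to unit length. This is only an illustration with made-up 2-dimensional vectors; the function names are hypothetical, and the bias-free Dense layer that sits between the two stages in the real model is omitted for brevity.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding tokens are skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding.
toy_tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
toy_mask = [1, 1, 0]

pooled = mean_pool(toy_tokens, toy_mask)    # [3.0, 4.0]
sentence_embedding = l2_normalize(pooled)   # [0.6, 0.8]
print(pooled, sentence_embedding)
```

Note that max_seq_length is 34, so longer inputs are truncated to 34 tokens before pooling.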

Citing & Authors

Dataset Source: https://www.kaggle.com/datasets/quora/question-pairs-dataset
