metadata
license: apache-2.0
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- generated_from_trainer
datasets:
- squad
- newsqa
- LLukas22/cqadupstack
- LLukas22/fiqa
- LLukas22/scidocs
- deepset/germanquad
- LLukas22/nq
language:
- en
- de
all-MiniLM-L12-v2-embedding-all
This model is a fine-tuned version of all-MiniLM-L12-v2 on the following datasets: squad, newsqa, LLukas22/cqadupstack, LLukas22/fiqa, LLukas22/scidocs, deepset/germanquad, LLukas22/nq.
Usage (Sentence-Transformers)
Using this model is straightforward once sentence-transformers is installed:
pip install -U sentence-transformers
Then you can use the model like this:
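from sentence_transformers import SentenceTransformer

# Sentences to embed
sentences = ["This is an example sentence", "Each sentence is converted"]
# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer('LLukas22/all-MiniLM-L12-v2-embedding-all')
# Encode the sentences; the result holds one embedding vector per sentence
embeddings = model.encode(sentences)
print(embeddings)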
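Since the model is tagged for sentence similarity, the embeddings are typically compared with cosine similarity. The following is a minimal sketch of that comparison in plain Python. The helper names `cosine_similarity` and `rank_passages` are hypothetical, and the short toy vectors only stand in for real rows of the array returned by `model.encode`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors (assumption: in practice these would be
# embeddings produced by model.encode, not hand-written lists).
query = [1.0, 0.0, 1.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.1, 0.9],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # partially aligned with the query
]

ranking = rank_passages(query, passages)
print(ranking)
```

With real embeddings, the same ranking step would be applied to the encoded query and the encoded candidate passages.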
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1E+00
- per device batch size: 60
- effective batch size: 180
- seed: 42
- optimizer: AdamW with betas (0.9,0.999) and eps 1E-08
- weight decay: 2E-02
- D-Adaptation: True
- Warmup: True
- number of epochs: 20
- mixed_precision_training: bf16
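The per device batch size of 60 and the effective batch size of 180 differ by a factor of 3. The card does not say whether that factor comes from multiple devices, gradient accumulation, or both, so the sketch below only illustrates the usual arithmetic. The function name and its parameters are hypothetical.

```python
def effective_batch_size(per_device, num_devices=1, accumulation_steps=1):
    """Samples contributing to one optimizer step (a common definition)."""
    return per_device * num_devices * accumulation_steps


per_device = 60
effective = 180

# Factor that has to be explained by devices and/or accumulation steps.
factor = effective // per_device
print(factor)

# Two hypothetical setups that would both yield the reported value.
three_devices = effective_batch_size(per_device, num_devices=3)
three_accumulation_steps = effective_batch_size(per_device, accumulation_steps=3)
print(three_devices, three_accumulation_steps)
```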
Training results
Epoch | Train Loss | Validation Loss |
---|---|---|
0 | 0.0708 | 0.0619 |
1 | 0.0609 | 0.0567 |
2 | 0.0531 | 0.0542 |
3 | 0.0475 | 0.0528 |
4 | 0.0428 | 0.0521 |
5 | 0.0389 | 0.0513 |
6 | 0.0352 | 0.0508 |
7 | 0.0322 | 0.0494 |
8 | 0.0289 | 0.0485 |
9 | 0.0264 | 0.0483 |
10 | 0.0242 | 0.0466 |
11 | 0.0221 | 0.0459 |
12 | 0.0204 | 0.0469 |
13 | 0.0189 | 0.0459 |
Evaluation results
Epoch | top_1 | top_3 | top_5 | top_10 | top_25 |
---|---|---|---|---|---|
0 | 0.507 | 0.665 | 0.721 | 0.784 | 0.847 |
1 | 0.501 | 0.661 | 0.719 | 0.783 | 0.846 |
2 | 0.508 | 0.669 | 0.726 | 0.789 | 0.851 |
3 | 0.507 | 0.665 | 0.722 | 0.785 | 0.850 |
4 | 0.506 | 0.667 | 0.724 | 0.788 | 0.851 |
5 | 0.511 | 0.673 | 0.731 | 0.795 | 0.857 |
6 | 0.510 | 0.674 | 0.732 | 0.794 | 0.856 |
7 | 0.512 | 0.674 | 0.732 | 0.796 | 0.859 |
8 | 0.515 | 0.678 | 0.736 | 0.799 | 0.861 |
9 | 0.514 | 0.679 | 0.737 | 0.800 | 0.862 |
10 | 0.520 | 0.683 | 0.741 | 0.803 | 0.864 |
11 | 0.522 | 0.686 | 0.744 | 0.806 | 0.866 |
12 | 0.519 | 0.683 | 0.741 | 0.804 | 0.864 |
13 | 0.522 | 0.685 | 0.743 | 0.806 | 0.865 |
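The card does not define the top_k columns. A common reading, assumed here, is the fraction of queries whose correct passage appears among the k highest-ranked results. The sketch below shows that computation on made-up data. The function name, the passage ids, and the rankings are all hypothetical and are not taken from the actual evaluation.

```python
def top_k_accuracy(ranked_ids, gold_ids, k):
    """Fraction of queries whose gold passage is within the first k results."""
    hits = sum(
        1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k]
    )
    return hits / len(gold_ids)


# Hypothetical retrieval output: one ranked list of passage ids per query.
ranked_ids = [
    ["p3", "p1", "p7"],  # gold passage ranked first
    ["p2", "p9", "p4"],  # gold passage ranked third
    ["p5", "p6", "p8"],  # gold passage not retrieved
    ["p1", "p0", "p2"],  # gold passage ranked second
]
gold_ids = ["p3", "p4", "p0", "p0"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked_ids, gold_ids, k))
```

Under this definition the score can only stay equal or increase as k grows, which matches the pattern across the columns of the table above.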
Framework versions
- Transformers: 4.25.1
- PyTorch: 2.0.0.dev20230210+cu118
- PyTorch Lightning: 1.8.6
- Datasets: 2.7.1
- Tokenizers: 0.13.1
- Sentence Transformers: 2.2.2
Additional Information
This model was trained as part of my master's thesis, 'Evaluation of transformer based language models for use in service information systems'. The source code is available on GitHub.