---
pipeline_tag: sentence-similarity
tags:
- finetuner
- mteb
- sentence-transformers
- feature-extraction
- sentence-similarity
datasets:
- jinaai/negation-dataset
language: en
license: apache-2.0
model-index:
- name: jina-triplets-large
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 68.92537313432835
    - type: ap
      value: 29.723758877632513
    - type: f1
      value: 61.909704211663794
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 69.13669999999999
    - type: ap
      value: 65.30216072238086
    - type: f1
      value: 67.1890891071034
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 31.384
    - type: f1
      value: 30.016752348953723
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.613
    - type: map_at_10
      value: 37.897
    - type: map_at_100
      value: 39.093
    - type: map_at_1000
      value: 39.109
    - type: map_at_3
      value: 32.824
    - type: map_at_5
      value: 35.679
    - type: mrr_at_1
      value: 23.826
    - type: mrr_at_10
      value: 37.997
    - type: mrr_at_100
      value: 39.186
    - type: mrr_at_1000
      value: 39.202
    - type: mrr_at_3
      value: 32.918
    - type: mrr_at_5
      value: 35.748999999999995
    - type: ndcg_at_1
      value: 23.613
    - type: ndcg_at_10
      value: 46.482
    - type: ndcg_at_100
      value: 51.55499999999999
    - type: ndcg_at_1000
      value: 51.974
    - type: ndcg_at_3
      value: 35.964
    - type: ndcg_at_5
      value: 41.144999999999996
    - type: precision_at_1
      value: 23.613
    - type: precision_at_10
      value: 7.417999999999999
    - type: precision_at_100
      value: 0.963
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 15.031
    - type: precision_at_5
      value: 11.55
    - type: recall_at_1
      value: 23.613
    - type: recall_at_10
      value: 74.182
    - type: recall_at_100
      value: 96.30199999999999
    - type: recall_at_1000
      value: 99.57300000000001
    - type: recall_at_3
      value: 45.092
    - type: recall_at_5
      value: 57.752
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 40.51285742156528
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 31.5825964077496
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 62.830281630546835
    - type: mrr
      value: 75.93072593765115
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 87.26764516732737
    - type: cos_sim_spearman
      value: 84.42541766631741
    - type: euclidean_pearson
      value: 48.71357447655235
    - type: euclidean_spearman
      value: 49.2023259276511
    - type: manhattan_pearson
      value: 48.36366272727299
    - type: manhattan_spearman
      value: 48.457128224924354
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 85.3409090909091
    - type: f1
      value: 85.25262617676835
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 33.560193912974974
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 28.4426572644577
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.822999999999997
    - type: map_at_10
      value: 39.088
    - type: map_at_100
      value: 40.561
    - type: map_at_1000
      value: 40.69
    - type: map_at_3
      value: 35.701
    - type: map_at_5
      value: 37.556
    - type: mrr_at_1
      value: 33.906
    - type: mrr_at_10
      value: 44.527
    - type: mrr_at_100
      value: 45.403999999999996
    - type: mrr_at_1000
      value: 45.452
    - type: mrr_at_3
      value: 41.726
    - type: mrr_at_5
      value: 43.314
    - type: ndcg_at_1
      value: 33.906
    - type: ndcg_at_10
      value: 45.591
    - type: ndcg_at_100
      value: 51.041000000000004
    - type: ndcg_at_1000
      value: 53.1
    - type: ndcg_at_3
      value: 40.324
    - type: ndcg_at_5
      value: 42.723
    - type: precision_at_1
      value: 33.906
    - type: precision_at_10
      value: 8.655
    - type: precision_at_100
      value: 1.418
    - type: precision_at_1000
      value: 0.19499999999999998
    - type: precision_at_3
      value: 19.123
    - type: precision_at_5
      value: 13.963000000000001
    - type: recall_at_1
      value: 27.822999999999997
    - type: recall_at_10
      value: 58.63699999999999
    - type: recall_at_100
      value: 80.874
    - type: recall_at_1000
      value: 93.82000000000001
    - type: recall_at_3
      value: 44.116
    - type: recall_at_5
      value: 50.178999999999995
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.823999999999998
    - type: map_at_10
      value: 37.006
    - type: map_at_100
      value: 38.256
    - type: map_at_1000
      value: 38.397999999999996
    - type: map_at_3
      value: 34.011
    - type: map_at_5
      value: 35.643
    - type: mrr_at_1
      value: 34.268
    - type: mrr_at_10
      value: 43.374
    - type: mrr_at_100
      value: 44.096000000000004
    -
      type: mrr_at_1000
      value: 44.144
    - type: mrr_at_3
      value: 41.008
    - type: mrr_at_5
      value: 42.359
    - type: ndcg_at_1
      value: 34.268
    - type: ndcg_at_10
      value: 43.02
    - type: ndcg_at_100
      value: 47.747
    - type: ndcg_at_1000
      value: 50.019999999999996
    - type: ndcg_at_3
      value: 38.687
    - type: ndcg_at_5
      value: 40.647
    - type: precision_at_1
      value: 34.268
    - type: precision_at_10
      value: 8.261000000000001
    - type: precision_at_100
      value: 1.376
    - type: precision_at_1000
      value: 0.189
    - type: precision_at_3
      value: 19.108
    - type: precision_at_5
      value: 13.489999999999998
    - type: recall_at_1
      value: 26.823999999999998
    - type: recall_at_10
      value: 53.84100000000001
    - type: recall_at_100
      value: 73.992
    - type: recall_at_1000
      value: 88.524
    - type: recall_at_3
      value: 40.711000000000006
    - type: recall_at_5
      value: 46.477000000000004
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 34.307
    - type: map_at_10
      value: 45.144
    - type: map_at_100
      value: 46.351
    - type: map_at_1000
      value: 46.414
    - type: map_at_3
      value: 42.315000000000005
    - type: map_at_5
      value: 43.991
    - type: mrr_at_1
      value: 39.06
    - type: mrr_at_10
      value: 48.612
    - type: mrr_at_100
      value: 49.425000000000004
    - type: mrr_at_1000
      value: 49.458999999999996
    - type: mrr_at_3
      value: 46.144
    - type: mrr_at_5
      value: 47.654999999999994
    - type: ndcg_at_1
      value: 39.06
    - type: ndcg_at_10
      value: 50.647
    - type: ndcg_at_100
      value: 55.620000000000005
    - type: ndcg_at_1000
      value: 56.976000000000006
    - type: ndcg_at_3
      value: 45.705
    - type: ndcg_at_5
      value: 48.269
    - type: precision_at_1
      value: 39.06
    - type: precision_at_10
      value: 8.082
    - type: precision_at_100
      value: 1.161
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 20.376
    - type: precision_at_5
      value: 14.069
    - type: recall_at_1
      value: 34.307
    - type: recall_at_10
      value: 63.497
    - type: recall_at_100
      value: 85.038
    - type: recall_at_1000
      value: 94.782
    - type: recall_at_3
      value:
        50.209
    - type: recall_at_5
      value: 56.525000000000006
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.448
    - type: map_at_10
      value: 34.86
    - type: map_at_100
      value: 36.004999999999995
    - type: map_at_1000
      value: 36.081
    - type: map_at_3
      value: 32.527
    - type: map_at_5
      value: 33.955
    - type: mrr_at_1
      value: 28.701
    - type: mrr_at_10
      value: 36.909
    - type: mrr_at_100
      value: 37.89
    - type: mrr_at_1000
      value: 37.945
    - type: mrr_at_3
      value: 34.576
    - type: mrr_at_5
      value: 35.966
    - type: ndcg_at_1
      value: 28.701
    - type: ndcg_at_10
      value: 39.507999999999996
    - type: ndcg_at_100
      value: 45.056000000000004
    - type: ndcg_at_1000
      value: 47.034
    - type: ndcg_at_3
      value: 34.985
    - type: ndcg_at_5
      value: 37.384
    - type: precision_at_1
      value: 28.701
    - type: precision_at_10
      value: 5.921
    - type: precision_at_100
      value: 0.914
    - type: precision_at_1000
      value: 0.11199999999999999
    - type: precision_at_3
      value: 14.689
    - type: precision_at_5
      value: 10.237
    - type: recall_at_1
      value: 26.448
    - type: recall_at_10
      value: 51.781
    - type: recall_at_100
      value: 77.142
    - type: recall_at_1000
      value: 92.10000000000001
    - type: recall_at_3
      value: 39.698
    - type: recall_at_5
      value: 45.469
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 14.174000000000001
    - type: map_at_10
      value: 22.019
    - type: map_at_100
      value: 23.18
    - type: map_at_1000
      value: 23.304
    - type: map_at_3
      value: 19.332
    - type: map_at_5
      value: 20.816000000000003
    - type: mrr_at_1
      value: 17.785999999999998
    - type: mrr_at_10
      value: 26.233
    - type: mrr_at_100
      value: 27.254
    - type: mrr_at_1000
      value: 27.328000000000003
    - type: mrr_at_3
      value: 23.653
    - type: mrr_at_5
      value: 25.095
    - type: ndcg_at_1
      value: 17.785999999999998
    - type: ndcg_at_10
      value: 27.236
    - type: ndcg_at_100
      value: 32.932
    - type: ndcg_at_1000
      value: 36.134
    - type: ndcg_at_3
      value: 22.33
    - type: ndcg_at_5
      value: 24.573999999999998
    - type: precision_at_1
      value: 17.785999999999998
    - type: precision_at_10
      value: 5.286
    - type: precision_at_100
      value: 0.9369999999999999
    - type: precision_at_1000
      value: 0.136
    - type: precision_at_3
      value: 11.07
    - type: precision_at_5
      value: 8.308
    - type: recall_at_1
      value: 14.174000000000001
    - type: recall_at_10
      value: 39.135
    - type: recall_at_100
      value: 64.095
    - type: recall_at_1000
      value: 87.485
    - type: recall_at_3
      value: 25.496999999999996
    - type: recall_at_5
      value: 31.148999999999997
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.371000000000002
    - type: map_at_10
      value: 33.074999999999996
    - type: map_at_100
      value: 34.486
    - type: map_at_1000
      value: 34.608
    - type: map_at_3
      value: 30.483
    - type: map_at_5
      value: 31.972
    - type: mrr_at_1
      value: 29.548000000000002
    - type: mrr_at_10
      value: 38.431
    - type: mrr_at_100
      value: 39.347
    - type: mrr_at_1000
      value: 39.4
    - type: mrr_at_3
      value: 35.980000000000004
    - type: mrr_at_5
      value: 37.413999999999994
    - type: ndcg_at_1
      value: 29.548000000000002
    - type: ndcg_at_10
      value: 38.552
    - type: ndcg_at_100
      value: 44.598
    - type: ndcg_at_1000
      value: 47.0
    - type: ndcg_at_3
      value: 34.109
    - type: ndcg_at_5
      value: 36.263
    - type: precision_at_1
      value: 29.548000000000002
    - type: precision_at_10
      value: 6.92
    - type: precision_at_100
      value: 1.179
    - type: precision_at_1000
      value: 0.159
    - type: precision_at_3
      value: 16.137
    - type: precision_at_5
      value: 11.511000000000001
    - type: recall_at_1
      value: 24.371000000000002
    - type: recall_at_10
      value: 49.586999999999996
    - type: recall_at_100
      value: 75.15899999999999
    - type: recall_at_1000
      value: 91.06
    - type: recall_at_3
      value: 37.09
    - type: recall_at_5
      value: 42.588
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.517
    - type: map_at_10
      value: 32.969
    - type: map_at_100
      value: 34.199
    - type: map_at_1000
      value: 34.322
    - type: map_at_3
      value: 30.270999999999997
    - type: map_at_5
      value: 31.863000000000003
    - type: mrr_at_1
      value: 30.479
    - type: mrr_at_10
      value: 38.633
    - type: mrr_at_100
      value: 39.522
    - type: mrr_at_1000
      value: 39.583
    - type: mrr_at_3
      value: 36.454
    - type: mrr_at_5
      value: 37.744
    - type: ndcg_at_1
      value: 30.479
    - type: ndcg_at_10
      value: 38.269
    - type: ndcg_at_100
      value: 43.91
    - type: ndcg_at_1000
      value: 46.564
    - type: ndcg_at_3
      value: 34.03
    - type: ndcg_at_5
      value: 36.155
    - type: precision_at_1
      value: 30.479
    - type: precision_at_10
      value: 6.815
    - type: precision_at_100
      value: 1.138
    - type: precision_at_1000
      value: 0.158
    - type: precision_at_3
      value: 16.058
    - type: precision_at_5
      value: 11.416
    - type: recall_at_1
      value: 24.517
    - type: recall_at_10
      value: 48.559000000000005
    - type: recall_at_100
      value: 73.307
    - type: recall_at_1000
      value: 91.508
    - type: recall_at_3
      value: 36.563
    - type: recall_at_5
      value: 42.375
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.388
    - type: map_at_10
      value: 29.408
    - type: map_at_100
      value: 30.452
    - type: map_at_1000
      value: 30.546
    - type: map_at_3
      value: 27.139000000000003
    - type: map_at_5
      value: 28.402
    - type: mrr_at_1
      value: 25.46
    - type: mrr_at_10
      value: 31.966
    - type: mrr_at_100
      value: 32.879999999999995
    - type: mrr_at_1000
      value: 32.944
    - type: mrr_at_3
      value: 29.755
    - type: mrr_at_5
      value: 30.974
    - type: ndcg_at_1
      value: 25.46
    - type: ndcg_at_10
      value: 33.449
    - type: ndcg_at_100
      value: 38.67
    - type: ndcg_at_1000
      value: 41.035
    - type: ndcg_at_3
      value: 29.048000000000002
    - type: ndcg_at_5
      value: 31.127
    - type: precision_at_1
      value: 25.46
    - type: precision_at_10
      value: 5.199
    - type: precision_at_100
      value: 0.8670000000000001
    - type: precision_at_1000
      value: 0.11399999999999999
    -
      type: precision_at_3
      value: 12.168
    - type: precision_at_5
      value: 8.62
    - type: recall_at_1
      value: 23.388
    - type: recall_at_10
      value: 43.428
    - type: recall_at_100
      value: 67.245
    - type: recall_at_1000
      value: 84.75399999999999
    - type: recall_at_3
      value: 31.416
    - type: recall_at_5
      value: 36.451
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.136000000000003
    - type: map_at_10
      value: 24.102999999999998
    - type: map_at_100
      value: 25.219
    - type: map_at_1000
      value: 25.344
    - type: map_at_3
      value: 22.004
    - type: map_at_5
      value: 23.145
    - type: mrr_at_1
      value: 20.613
    - type: mrr_at_10
      value: 27.753
    - type: mrr_at_100
      value: 28.698
    - type: mrr_at_1000
      value: 28.776000000000003
    - type: mrr_at_3
      value: 25.711000000000002
    - type: mrr_at_5
      value: 26.795
    - type: ndcg_at_1
      value: 20.613
    - type: ndcg_at_10
      value: 28.510999999999996
    - type: ndcg_at_100
      value: 33.924
    - type: ndcg_at_1000
      value: 36.849
    - type: ndcg_at_3
      value: 24.664
    - type: ndcg_at_5
      value: 26.365
    - type: precision_at_1
      value: 20.613
    - type: precision_at_10
      value: 5.069
    - type: precision_at_100
      value: 0.918
    - type: precision_at_1000
      value: 0.136
    - type: precision_at_3
      value: 11.574
    - type: precision_at_5
      value: 8.211
    - type: recall_at_1
      value: 17.136000000000003
    - type: recall_at_10
      value: 38.232
    - type: recall_at_100
      value: 62.571
    - type: recall_at_1000
      value: 83.23
    - type: recall_at_3
      value: 27.468999999999998
    - type: recall_at_5
      value: 31.852999999999998
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.580000000000002
    - type: map_at_10
      value: 33.449
    - type: map_at_100
      value: 34.58
    - type: map_at_1000
      value: 34.692
    - type: map_at_3
      value: 30.660999999999998
    - type: map_at_5
      value: 32.425
    - type: mrr_at_1
      value: 30.037000000000003
    - type: mrr_at_10
      value: 37.443
    - type: mrr_at_100
      value: 38.32
    - type: mrr_at_1000
      value: 38.384
    - type: mrr_at_3
      value: 34.778999999999996
    - type: mrr_at_5
      value: 36.458
    - type: ndcg_at_1
      value: 30.037000000000003
    - type: ndcg_at_10
      value: 38.46
    - type: ndcg_at_100
      value: 43.746
    - type: ndcg_at_1000
      value: 46.28
    - type: ndcg_at_3
      value: 33.52
    - type: ndcg_at_5
      value: 36.175000000000004
    - type: precision_at_1
      value: 30.037000000000003
    - type: precision_at_10
      value: 6.418
    - type: precision_at_100
      value: 1.0210000000000001
    - type: precision_at_1000
      value: 0.136
    - type: precision_at_3
      value: 15.018999999999998
    - type: precision_at_5
      value: 10.877
    - type: recall_at_1
      value: 25.580000000000002
    - type: recall_at_10
      value: 49.830000000000005
    - type: recall_at_100
      value: 73.04899999999999
    - type: recall_at_1000
      value: 90.751
    - type: recall_at_3
      value: 36.370999999999995
    - type: recall_at_5
      value: 43.104
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.071
    - type: map_at_10
      value: 33.384
    - type: map_at_100
      value: 35.004999999999995
    - type: map_at_1000
      value: 35.215999999999994
    - type: map_at_3
      value: 30.459000000000003
    - type: map_at_5
      value: 31.769
    - type: mrr_at_1
      value: 28.854000000000003
    - type: mrr_at_10
      value: 37.512
    - type: mrr_at_100
      value: 38.567
    - type: mrr_at_1000
      value: 38.618
    - type: mrr_at_3
      value: 35.211
    - type: mrr_at_5
      value: 36.13
    - type: ndcg_at_1
      value: 28.854000000000003
    - type: ndcg_at_10
      value: 39.216
    - type: ndcg_at_100
      value: 45.214
    - type: ndcg_at_1000
      value: 47.573
    - type: ndcg_at_3
      value: 34.597
    - type: ndcg_at_5
      value: 36.063
    - type: precision_at_1
      value: 28.854000000000003
    - type: precision_at_10
      value: 7.648000000000001
    - type: precision_at_100
      value: 1.545
    - type: precision_at_1000
      value: 0.241
    - type: precision_at_3
      value: 16.667
    - type: precision_at_5
      value: 11.818
    - type: recall_at_1
      value: 24.071
    - type: recall_at_10
      value: 50.802
    - type: recall_at_100
      value:
        77.453
    - type: recall_at_1000
      value: 92.304
    - type: recall_at_3
      value: 36.846000000000004
    - type: recall_at_5
      value: 41.14
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.395
    - type: map_at_10
      value: 29.189999999999998
    - type: map_at_100
      value: 30.226999999999997
    - type: map_at_1000
      value: 30.337999999999997
    - type: map_at_3
      value: 27.342
    - type: map_at_5
      value: 28.116999999999997
    - type: mrr_at_1
      value: 25.323
    - type: mrr_at_10
      value: 31.241000000000003
    - type: mrr_at_100
      value: 32.225
    - type: mrr_at_1000
      value: 32.304
    - type: mrr_at_3
      value: 29.452
    - type: mrr_at_5
      value: 30.209000000000003
    - type: ndcg_at_1
      value: 25.323
    - type: ndcg_at_10
      value: 33.024
    - type: ndcg_at_100
      value: 38.279
    - type: ndcg_at_1000
      value: 41.026
    - type: ndcg_at_3
      value: 29.243000000000002
    - type: ndcg_at_5
      value: 30.564000000000004
    - type: precision_at_1
      value: 25.323
    - type: precision_at_10
      value: 4.972
    - type: precision_at_100
      value: 0.8210000000000001
    - type: precision_at_1000
      value: 0.116
    - type: precision_at_3
      value: 12.076
    - type: precision_at_5
      value: 8.133
    - type: recall_at_1
      value: 23.395
    - type: recall_at_10
      value: 42.994
    - type: recall_at_100
      value: 66.985
    - type: recall_at_1000
      value: 87.483
    - type: recall_at_3
      value: 32.505
    - type: recall_at_5
      value: 35.721000000000004
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 8.322000000000001
    - type: map_at_10
      value: 14.491000000000001
    - type: map_at_100
      value: 16.066
    - type: map_at_1000
      value: 16.238
    - type: map_at_3
      value: 12.235
    - type: map_at_5
      value: 13.422999999999998
    - type: mrr_at_1
      value: 19.479
    - type: mrr_at_10
      value: 29.38
    - type: mrr_at_100
      value: 30.520999999999997
    - type: mrr_at_1000
      value: 30.570999999999998
    - type: mrr_at_3
      value: 26.395000000000003
    - type: mrr_at_5
      value:
        27.982000000000003
    - type: ndcg_at_1
      value: 19.479
    - type: ndcg_at_10
      value: 21.215
    - type: ndcg_at_100
      value: 27.966
    - type: ndcg_at_1000
      value: 31.324
    - type: ndcg_at_3
      value: 17.194000000000003
    - type: ndcg_at_5
      value: 18.593
    - type: precision_at_1
      value: 19.479
    - type: precision_at_10
      value: 6.5280000000000005
    - type: precision_at_100
      value: 1.359
    - type: precision_at_1000
      value: 0.198
    - type: precision_at_3
      value: 12.703999999999999
    - type: precision_at_5
      value: 9.655
    - type: recall_at_1
      value: 8.322000000000001
    - type: recall_at_10
      value: 26.165
    - type: recall_at_100
      value: 49.573
    - type: recall_at_1000
      value: 68.501
    - type: recall_at_3
      value: 16.179
    - type: recall_at_5
      value: 20.175
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 8.003
    - type: map_at_10
      value: 16.087
    - type: map_at_100
      value: 21.363
    - type: map_at_1000
      value: 22.64
    - type: map_at_3
      value: 12.171999999999999
    - type: map_at_5
      value: 13.866
    - type: mrr_at_1
      value: 61.25000000000001
    - type: mrr_at_10
      value: 68.626
    - type: mrr_at_100
      value: 69.134
    - type: mrr_at_1000
      value: 69.144
    - type: mrr_at_3
      value: 67.042
    - type: mrr_at_5
      value: 67.929
    - type: ndcg_at_1
      value: 49.0
    - type: ndcg_at_10
      value: 34.132
    - type: ndcg_at_100
      value: 37.545
    - type: ndcg_at_1000
      value: 44.544
    - type: ndcg_at_3
      value: 38.946999999999996
    - type: ndcg_at_5
      value: 36.317
    - type: precision_at_1
      value: 61.25000000000001
    - type: precision_at_10
      value: 26.325
    - type: precision_at_100
      value: 8.173
    - type: precision_at_1000
      value: 1.778
    - type: precision_at_3
      value: 41.667
    - type: precision_at_5
      value: 34.300000000000004
    - type: recall_at_1
      value: 8.003
    - type: recall_at_10
      value: 20.577
    - type: recall_at_100
      value: 41.884
    - type: recall_at_1000
      value: 64.36500000000001
    - type: recall_at_3
      value: 13.602
    - type: recall_at_5
      value: 16.41
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 45.835
    - type: f1
      value: 41.66455981281837
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 55.717000000000006
    - type: map_at_10
      value: 66.34100000000001
    - type: map_at_100
      value: 66.776
    - type: map_at_1000
      value: 66.794
    - type: map_at_3
      value: 64.386
    - type: map_at_5
      value: 65.566
    - type: mrr_at_1
      value: 60.141
    - type: mrr_at_10
      value: 70.928
    - type: mrr_at_100
      value: 71.29299999999999
    - type: mrr_at_1000
      value: 71.30199999999999
    - type: mrr_at_3
      value: 69.07900000000001
    - type: mrr_at_5
      value: 70.244
    - type: ndcg_at_1
      value: 60.141
    - type: ndcg_at_10
      value: 71.90100000000001
    - type: ndcg_at_100
      value: 73.836
    - type: ndcg_at_1000
      value: 74.214
    - type: ndcg_at_3
      value: 68.203
    - type: ndcg_at_5
      value: 70.167
    - type: precision_at_1
      value: 60.141
    - type: precision_at_10
      value: 9.268
    - type: precision_at_100
      value: 1.03
    - type: precision_at_1000
      value: 0.108
    - type: precision_at_3
      value: 27.028000000000002
    - type: precision_at_5
      value: 17.342
    - type: recall_at_1
      value: 55.717000000000006
    - type: recall_at_10
      value: 84.66799999999999
    - type: recall_at_100
      value: 93.28
    - type: recall_at_1000
      value: 95.887
    - type: recall_at_3
      value: 74.541
    - type: recall_at_5
      value: 79.389
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.744
    - type: map_at_10
      value: 29.554000000000002
    - type: map_at_100
      value: 31.180000000000003
    - type: map_at_1000
      value: 31.372
    - type: map_at_3
      value: 25.6
    - type: map_at_5
      value: 27.642
    - type: mrr_at_1
      value: 35.802
    - type: mrr_at_10
      value: 44.812999999999995
    - type: mrr_at_100
      value: 45.56
    - type: mrr_at_1000
      value: 45.606
    - type: mrr_at_3
      value: 42.181000000000004
    - type: mrr_at_5
      value: 43.516
    - type: ndcg_at_1
      value: 35.802
    - type: ndcg_at_10
      value:
        37.269999999999996
    - type: ndcg_at_100
      value: 43.575
    - type: ndcg_at_1000
      value: 46.916000000000004
    - type: ndcg_at_3
      value: 33.511
    - type: ndcg_at_5
      value: 34.504000000000005
    - type: precision_at_1
      value: 35.802
    - type: precision_at_10
      value: 10.448
    - type: precision_at_100
      value: 1.7129999999999999
    - type: precision_at_1000
      value: 0.231
    - type: precision_at_3
      value: 22.531000000000002
    - type: precision_at_5
      value: 16.512
    - type: recall_at_1
      value: 17.744
    - type: recall_at_10
      value: 44.616
    - type: recall_at_100
      value: 68.51899999999999
    - type: recall_at_1000
      value: 88.495
    - type: recall_at_3
      value: 30.235
    - type: recall_at_5
      value: 35.821999999999996
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 33.315
    - type: map_at_10
      value: 45.932
    - type: map_at_100
      value: 46.708
    - type: map_at_1000
      value: 46.778999999999996
    - type: map_at_3
      value: 43.472
    - type: map_at_5
      value: 45.022
    - type: mrr_at_1
      value: 66.631
    - type: mrr_at_10
      value: 73.083
    - type: mrr_at_100
      value: 73.405
    - type: mrr_at_1000
      value: 73.421
    - type: mrr_at_3
      value: 71.756
    - type: mrr_at_5
      value: 72.616
    - type: ndcg_at_1
      value: 66.631
    - type: ndcg_at_10
      value: 54.949000000000005
    - type: ndcg_at_100
      value: 57.965
    - type: ndcg_at_1000
      value: 59.467000000000006
    - type: ndcg_at_3
      value: 51.086
    - type: ndcg_at_5
      value: 53.272
    - type: precision_at_1
      value: 66.631
    - type: precision_at_10
      value: 11.178
    - type: precision_at_100
      value: 1.3559999999999999
    - type: precision_at_1000
      value: 0.156
    - type: precision_at_3
      value: 31.582
    - type: precision_at_5
      value: 20.678
    - type: recall_at_1
      value: 33.315
    - type: recall_at_10
      value: 55.888000000000005
    - type: recall_at_100
      value: 67.812
    - type: recall_at_1000
      value: 77.839
    - type: recall_at_3
      value: 47.373
    - type: recall_at_5
      value: 51.695
  - task:
      type: Classification
    dataset:
      type: mteb/imdb
      name: MTEB ImdbClassification
      config: default
      split: test
      revision:
        3d86128a09e091d6018b6d26cad27f2739fc2db7
    metrics:
    - type: accuracy
      value: 66.424
    - type: ap
      value: 61.132235499939256
    - type: f1
      value: 66.07094958225315
  - task:
      type: Retrieval
    dataset:
      type: msmarco
      name: MTEB MSMARCO
      config: default
      split: dev
      revision: None
    metrics:
    - type: map_at_1
      value: 21.575
    - type: map_at_10
      value: 33.509
    - type: map_at_100
      value: 34.725
    - type: map_at_1000
      value: 34.775
    - type: map_at_3
      value: 29.673
    - type: map_at_5
      value: 31.805
    - type: mrr_at_1
      value: 22.235
    - type: mrr_at_10
      value: 34.1
    - type: mrr_at_100
      value: 35.254999999999995
    - type: mrr_at_1000
      value: 35.299
    - type: mrr_at_3
      value: 30.334
    - type: mrr_at_5
      value: 32.419
    - type: ndcg_at_1
      value: 22.235
    - type: ndcg_at_10
      value: 40.341
    - type: ndcg_at_100
      value: 46.161
    - type: ndcg_at_1000
      value: 47.400999999999996
    - type: ndcg_at_3
      value: 32.482
    - type: ndcg_at_5
      value: 36.269
    - type: precision_at_1
      value: 22.235
    - type: precision_at_10
      value: 6.422999999999999
    - type: precision_at_100
      value: 0.9329999999999999
    - type: precision_at_1000
      value: 0.104
    - type: precision_at_3
      value: 13.835
    - type: precision_at_5
      value: 10.226
    - type: recall_at_1
      value: 21.575
    - type: recall_at_10
      value: 61.448
    - type: recall_at_100
      value: 88.289
    - type: recall_at_1000
      value: 97.76899999999999
    - type: recall_at_3
      value: 39.971000000000004
    - type: recall_at_5
      value: 49.053000000000004
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_domain
      name: MTEB MTOPDomainClassification (en)
      config: en
      split: test
      revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
    metrics:
    - type: accuracy
      value: 92.83401732786137
    - type: f1
      value: 92.47678691291068
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_intent
      name: MTEB MTOPIntentClassification (en)
      config: en
      split: test
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
    metrics:
    - type: accuracy
      value: 76.08983128134975
    - type: f1
      value: 59.782936393820904
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_intent
      name: MTEB
        MassiveIntentClassification (en)
      config: en
      split: test
      revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
    metrics:
    - type: accuracy
      value: 72.73032952252858
    - type: f1
      value: 70.72684765888265
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_scenario
      name: MTEB MassiveScenarioClassification (en)
      config: en
      split: test
      revision: 7d571f92784cd94a019292a1f45445077d0ef634
    metrics:
    - type: accuracy
      value: 77.08473436449226
    - type: f1
      value: 77.31457411257054
  - task:
      type: Clustering
    dataset:
      type: mteb/medrxiv-clustering-p2p
      name: MTEB MedrxivClusteringP2P
      config: default
      split: test
      revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
    metrics:
    - type: v_measure
      value: 30.11980959210532
  - task:
      type: Clustering
    dataset:
      type: mteb/medrxiv-clustering-s2s
      name: MTEB MedrxivClusteringS2S
      config: default
      split: test
      revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
    metrics:
    - type: v_measure
      value: 25.2587629106119
  - task:
      type: Reranking
    dataset:
      type: mteb/mind_small
      name: MTEB MindSmallReranking
      config: default
      split: test
      revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
    metrics:
    - type: map
      value: 31.48268319779204
    - type: mrr
      value: 32.501885728964304
  - task:
      type: Retrieval
    dataset:
      type: nfcorpus
      name: MTEB NFCorpus
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 5.284
    - type: map_at_10
      value: 11.509
    - type: map_at_100
      value: 14.624
    - type: map_at_1000
      value: 16.035
    - type: map_at_3
      value: 8.347999999999999
    - type: map_at_5
      value: 9.919
    - type: mrr_at_1
      value: 43.344
    - type: mrr_at_10
      value: 52.303999999999995
    - type: mrr_at_100
      value: 52.994
    - type: mrr_at_1000
      value: 53.032999999999994
    - type: mrr_at_3
      value: 50.361
    - type: mrr_at_5
      value: 51.754
    - type: ndcg_at_1
      value: 41.176
    - type: ndcg_at_10
      value: 32.244
    - type: ndcg_at_100
      value: 29.916999999999998
    - type: ndcg_at_1000
      value: 38.753
    - type: ndcg_at_3
      value: 36.856
    - type: ndcg_at_5
      value: 35.394999999999996
    - type: precision_at_1
      value: 43.034
    - type:
        precision_at_10
      value: 24.118000000000002
    - type: precision_at_100
      value: 7.926
    - type: precision_at_1000
      value: 2.045
    - type: precision_at_3
      value: 34.675
    - type: precision_at_5
      value: 31.146
    - type: recall_at_1
      value: 5.284
    - type: recall_at_10
      value: 15.457
    - type: recall_at_100
      value: 30.914
    - type: recall_at_1000
      value: 63.788999999999994
    - type: recall_at_3
      value: 9.596
    - type: recall_at_5
      value: 12.391
  - task:
      type: Retrieval
    dataset:
      type: nq
      name: MTEB NQ
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 29.537999999999997
    - type: map_at_10
      value: 43.99
    - type: map_at_100
      value: 45.003
    - type: map_at_1000
      value: 45.04
    - type: map_at_3
      value: 39.814
    - type: map_at_5
      value: 42.166
    - type: mrr_at_1
      value: 33.256
    - type: mrr_at_10
      value: 46.487
    - type: mrr_at_100
      value: 47.264
    - type: mrr_at_1000
      value: 47.29
    - type: mrr_at_3
      value: 43.091
    - type: mrr_at_5
      value: 45.013999999999996
    - type: ndcg_at_1
      value: 33.256
    - type: ndcg_at_10
      value: 51.403
    - type: ndcg_at_100
      value: 55.706999999999994
    - type: ndcg_at_1000
      value: 56.586000000000006
    - type: ndcg_at_3
      value: 43.559
    - type: ndcg_at_5
      value: 47.426
    - type: precision_at_1
      value: 33.256
    - type: precision_at_10
      value: 8.540000000000001
    - type: precision_at_100
      value: 1.093
    - type: precision_at_1000
      value: 0.11800000000000001
    - type: precision_at_3
      value: 19.834
    - type: precision_at_5
      value: 14.143
    - type: recall_at_1
      value: 29.537999999999997
    - type: recall_at_10
      value: 71.5
    - type: recall_at_100
      value: 90.25
    - type: recall_at_1000
      value: 96.82600000000001
    - type: recall_at_3
      value: 51.108
    - type: recall_at_5
      value: 60.006
  - task:
      type: Retrieval
    dataset:
      type: quora
      name: MTEB QuoraRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 70.526
    - type: map_at_10
      value: 84.342
    - type: map_at_100
      value: 84.985
    - type: map_at_1000
      value: 85.003
    - type: map_at_3
      value: 81.472
    - type: map_at_5
      value: 83.292
    - type: mrr_at_1
      value: 81.17
    - type: mrr_at_10
      value:
        87.33999999999999
    - type: mrr_at_100
      value: 87.445
    - type: mrr_at_1000
      value: 87.446
    - type: mrr_at_3
      value: 86.387
    - type: mrr_at_5
      value: 87.042
    - type: ndcg_at_1
      value: 81.19
    - type: ndcg_at_10
      value: 88.088
    - type: ndcg_at_100
      value: 89.35
    - type: ndcg_at_1000
      value: 89.462
    - type: ndcg_at_3
      value: 85.319
    - type: ndcg_at_5
      value: 86.858
    - type: precision_at_1
      value: 81.19
    - type: precision_at_10
      value: 13.33
    - type: precision_at_100
      value: 1.528
    - type: precision_at_1000
      value: 0.157
    - type: precision_at_3
      value: 37.31
    - type: precision_at_5
      value: 24.512
    - type: recall_at_1
      value: 70.526
    - type: recall_at_10
      value: 95.166
    - type: recall_at_100
      value: 99.479
    - type: recall_at_1000
      value: 99.984
    - type: recall_at_3
      value: 87.124
    - type: recall_at_5
      value: 91.53
  - task:
      type: Clustering
    dataset:
      type: mteb/reddit-clustering
      name: MTEB RedditClustering
      config: default
      split: test
      revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
    metrics:
    - type: v_measure
      value: 45.049073872893494
  - task:
      type: Clustering
    dataset:
      type: mteb/reddit-clustering-p2p
      name: MTEB RedditClusteringP2P
      config: default
      split: test
      revision: 282350215ef01743dc01b456c7f5241fa8937f16
    metrics:
    - type: v_measure
      value: 55.13810914528368
  - task:
      type: Retrieval
    dataset:
      type: scidocs
      name: MTEB SCIDOCS
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 4.593
    - type: map_at_10
      value: 10.907
    - type: map_at_100
      value: 12.888
    - type: map_at_1000
      value: 13.167000000000002
    - type: map_at_3
      value: 7.936
    - type: map_at_5
      value: 9.31
    - type: mrr_at_1
      value: 22.7
    - type: mrr_at_10
      value: 32.509
    - type: mrr_at_100
      value: 33.69
    - type: mrr_at_1000
      value: 33.747
    - type: mrr_at_3
      value: 29.599999999999998
    - type: mrr_at_5
      value: 31.155
    - type: ndcg_at_1
      value: 22.7
    - type: ndcg_at_10
      value: 18.445
    - type: ndcg_at_100
      value: 26.241999999999997
    - type: ndcg_at_1000
      value: 31.409
    - type: ndcg_at_3
      value: 17.864
    - type: ndcg_at_5
      value: 15.232999999999999
    - type: precision_at_1
      value:
22.7 - type: precision_at_10 value: 9.43 - type: precision_at_100 value: 2.061 - type: precision_at_1000 value: 0.331 - type: precision_at_3 value: 16.467000000000002 - type: precision_at_5 value: 13.08 - type: recall_at_1 value: 4.593 - type: recall_at_10 value: 19.115 - type: recall_at_100 value: 41.82 - type: recall_at_1000 value: 67.167 - type: recall_at_3 value: 9.983 - type: recall_at_5 value: 13.218 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 82.94432059816452 - type: cos_sim_spearman value: 79.19993315048852 - type: euclidean_pearson value: 72.43261099671753 - type: euclidean_spearman value: 71.51531114998619 - type: manhattan_pearson value: 71.83604124130447 - type: manhattan_spearman value: 71.24460392842295 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 84.25401068481673 - type: cos_sim_spearman value: 74.5249604699309 - type: euclidean_pearson value: 71.1324859629043 - type: euclidean_spearman value: 58.77041705276752 - type: manhattan_pearson value: 71.01471521586141 - type: manhattan_spearman value: 58.69949381017865 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 82.85731544223766 - type: cos_sim_spearman value: 83.15607264736185 - type: euclidean_pearson value: 75.8803249521361 - type: euclidean_spearman value: 76.4862168799065 - type: manhattan_pearson value: 75.80451454386811 - type: manhattan_spearman value: 76.35986831074699 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 82.40669043798857 - type: 
cos_sim_spearman value: 78.08686090667834 - type: euclidean_pearson value: 74.48574712193803 - type: euclidean_spearman value: 70.79423012045118 - type: manhattan_pearson value: 74.39099211477354 - type: manhattan_spearman value: 70.73135427277684 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 86.03027014209859 - type: cos_sim_spearman value: 86.91082847840946 - type: euclidean_pearson value: 69.13187603971996 - type: euclidean_spearman value: 70.0370035340552 - type: manhattan_pearson value: 69.2586635812031 - type: manhattan_spearman value: 70.18638387118486 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 82.41190748361883 - type: cos_sim_spearman value: 83.64850851235231 - type: euclidean_pearson value: 71.60523243575282 - type: euclidean_spearman value: 72.26134033805099 - type: manhattan_pearson value: 71.50771482066683 - type: manhattan_spearman value: 72.13707967973161 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 90.42838477648627 - type: cos_sim_spearman value: 90.15798155439076 - type: euclidean_pearson value: 77.09619972244516 - type: euclidean_spearman value: 75.5953488548861 - type: manhattan_pearson value: 77.36892406451771 - type: manhattan_spearman value: 75.76625156149356 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 65.76151154879307 - type: cos_sim_spearman value: 64.8846800918359 - type: euclidean_pearson value: 50.23302700257155 - type: euclidean_spearman value: 
58.89455187289583 - type: manhattan_pearson value: 50.05498582284945 - type: manhattan_spearman value: 58.75893793871576 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 84.72381109169437 - type: cos_sim_spearman value: 84.59820928231167 - type: euclidean_pearson value: 74.85450857429493 - type: euclidean_spearman value: 73.83634052565915 - type: manhattan_pearson value: 74.97349743979106 - type: manhattan_spearman value: 73.9636470375881 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 80.96736259172798 - type: mrr value: 94.48378781712114 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 46.344 - type: map_at_10 value: 54.962 - type: map_at_100 value: 55.772 - type: map_at_1000 value: 55.81700000000001 - type: map_at_3 value: 51.832 - type: map_at_5 value: 53.718999999999994 - type: mrr_at_1 value: 49.0 - type: mrr_at_10 value: 56.721 - type: mrr_at_100 value: 57.287 - type: mrr_at_1000 value: 57.330000000000005 - type: mrr_at_3 value: 54.056000000000004 - type: mrr_at_5 value: 55.822 - type: ndcg_at_1 value: 49.0 - type: ndcg_at_10 value: 59.757000000000005 - type: ndcg_at_100 value: 63.149 - type: ndcg_at_1000 value: 64.43100000000001 - type: ndcg_at_3 value: 54.105000000000004 - type: ndcg_at_5 value: 57.196999999999996 - type: precision_at_1 value: 49.0 - type: precision_at_10 value: 8.200000000000001 - type: precision_at_100 value: 1.0070000000000001 - type: precision_at_1000 value: 0.11100000000000002 - type: precision_at_3 value: 20.889 - type: precision_at_5 value: 14.399999999999999 - type: recall_at_1 value: 46.344 - type: recall_at_10 value: 72.722 - type: recall_at_100 value: 
88.167 - type: recall_at_1000 value: 98.333 - type: recall_at_3 value: 57.994 - type: recall_at_5 value: 65.506 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.83366336633664 - type: cos_sim_ap value: 96.09329747251944 - type: cos_sim_f1 value: 91.66255550074001 - type: cos_sim_precision value: 90.45764362220059 - type: cos_sim_recall value: 92.9 - type: dot_accuracy value: 99.32871287128712 - type: dot_ap value: 63.95436644147969 - type: dot_f1 value: 60.61814556331008 - type: dot_precision value: 60.437375745526836 - type: dot_recall value: 60.8 - type: euclidean_accuracy value: 99.66534653465347 - type: euclidean_ap value: 85.85143979761818 - type: euclidean_f1 value: 81.57033805888769 - type: euclidean_precision value: 89.68824940047962 - type: euclidean_recall value: 74.8 - type: manhattan_accuracy value: 99.65742574257426 - type: manhattan_ap value: 85.55693926348405 - type: manhattan_f1 value: 81.13804004214963 - type: manhattan_precision value: 85.74610244988864 - type: manhattan_recall value: 77.0 - type: max_accuracy value: 99.83366336633664 - type: max_ap value: 96.09329747251944 - type: max_f1 value: 91.66255550074001 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 45.23573510003245 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 33.37478638401161 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: 
e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 50.375920467392476 - type: mrr value: 51.17302223919871 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 29.768864092288343 - type: cos_sim_spearman value: 29.854278347043266 - type: dot_pearson value: 20.51281723837505 - type: dot_spearman value: 21.799102540913665 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.2 - type: map_at_10 value: 1.202 - type: map_at_100 value: 6.729 - type: map_at_1000 value: 15.928 - type: map_at_3 value: 0.492 - type: map_at_5 value: 0.712 - type: mrr_at_1 value: 76.0 - type: mrr_at_10 value: 84.75 - type: mrr_at_100 value: 84.75 - type: mrr_at_1000 value: 84.75 - type: mrr_at_3 value: 83.0 - type: mrr_at_5 value: 84.5 - type: ndcg_at_1 value: 71.0 - type: ndcg_at_10 value: 57.253 - type: ndcg_at_100 value: 44.383 - type: ndcg_at_1000 value: 38.666 - type: ndcg_at_3 value: 64.324 - type: ndcg_at_5 value: 60.791 - type: precision_at_1 value: 76.0 - type: precision_at_10 value: 59.599999999999994 - type: precision_at_100 value: 45.440000000000005 - type: precision_at_1000 value: 17.458000000000002 - type: precision_at_3 value: 69.333 - type: precision_at_5 value: 63.2 - type: recall_at_1 value: 0.2 - type: recall_at_10 value: 1.4949999999999999 - type: recall_at_100 value: 10.266 - type: recall_at_1000 value: 35.853 - type: recall_at_3 value: 0.5349999999999999 - type: recall_at_5 value: 0.8109999999999999 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 2.0140000000000002 - type: map_at_10 value: 8.474 - type: map_at_100 value: 14.058000000000002 - type: map_at_1000 value: 15.381 - type: map_at_3 value: 4.508 - type: 
map_at_5 value: 5.87 - type: mrr_at_1 value: 22.448999999999998 - type: mrr_at_10 value: 37.242 - type: mrr_at_100 value: 38.291 - type: mrr_at_1000 value: 38.311 - type: mrr_at_3 value: 32.312999999999995 - type: mrr_at_5 value: 34.762 - type: ndcg_at_1 value: 20.408 - type: ndcg_at_10 value: 20.729 - type: ndcg_at_100 value: 33.064 - type: ndcg_at_1000 value: 44.324999999999996 - type: ndcg_at_3 value: 21.251 - type: ndcg_at_5 value: 20.28 - type: precision_at_1 value: 22.448999999999998 - type: precision_at_10 value: 18.98 - type: precision_at_100 value: 7.224 - type: precision_at_1000 value: 1.471 - type: precision_at_3 value: 22.448999999999998 - type: precision_at_5 value: 20.816000000000003 - type: recall_at_1 value: 2.0140000000000002 - type: recall_at_10 value: 13.96 - type: recall_at_100 value: 44.187 - type: recall_at_1000 value: 79.328 - type: recall_at_3 value: 5.345 - type: recall_at_5 value: 7.979 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 69.1312 - type: ap value: 12.606776505497608 - type: f1 value: 52.4112415600534 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 58.16072439162422 - type: f1 value: 58.29152785435414 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 40.421119289825924 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: 
cos_sim_accuracy value: 85.48012159504083 - type: cos_sim_ap value: 72.31974877212102 - type: cos_sim_f1 value: 67.96846573681019 - type: cos_sim_precision value: 62.89562289562289 - type: cos_sim_recall value: 73.93139841688654 - type: dot_accuracy value: 78.52416999463551 - type: dot_ap value: 43.65271285411479 - type: dot_f1 value: 46.94641449960599 - type: dot_precision value: 37.456774599182644 - type: dot_recall value: 62.875989445910285 - type: euclidean_accuracy value: 83.90057817249806 - type: euclidean_ap value: 65.96278727778665 - type: euclidean_f1 value: 63.35733232284957 - type: euclidean_precision value: 60.770535497940394 - type: euclidean_recall value: 66.17414248021109 - type: manhattan_accuracy value: 83.96614412588663 - type: manhattan_ap value: 66.03670273156699 - type: manhattan_f1 value: 63.49128406579917 - type: manhattan_precision value: 59.366391184573 - type: manhattan_recall value: 68.23218997361478 - type: max_accuracy value: 85.48012159504083 - type: max_ap value: 72.31974877212102 - type: max_f1 value: 67.96846573681019 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 88.97038848139093 - type: cos_sim_ap value: 85.982764495556 - type: cos_sim_f1 value: 78.73283281450284 - type: cos_sim_precision value: 75.07857791436754 - type: cos_sim_recall value: 82.7610101632276 - type: dot_accuracy value: 83.21108394458028 - type: dot_ap value: 70.97956937273386 - type: dot_f1 value: 66.53083038279111 - type: dot_precision value: 58.7551622418879 - type: dot_recall value: 76.67847243609486 - type: euclidean_accuracy value: 84.31520937633407 - type: euclidean_ap value: 74.67323411319909 - type: euclidean_f1 value: 67.21935410935676 - type: euclidean_precision value: 65.82773636430733 - type: euclidean_recall value: 68.67108099784416 - type: manhattan_accuracy 
value: 84.35013777312066 - type: manhattan_ap value: 74.66508905354597 - type: manhattan_f1 value: 67.28264162375038 - type: manhattan_precision value: 66.19970193740686 - type: manhattan_recall value: 68.40160147828766 - type: max_accuracy value: 88.97038848139093 - type: max_ap value: 85.982764495556 - type: max_f1 value: 78.73283281450284 ---


A set of text embedding models trained by the Finetuner team at Jina AI.

## Intended Usage & Model Info

`jina-embedding-l-en-v1` is a language model trained on Jina AI's Linnaeus-Clean dataset. The dataset consists of 380 million sentence pairs, including query-document pairs, drawn from various domains and selected through a thorough cleaning process. The Linnaeus-Full dataset, from which Linnaeus-Clean is derived, originally contained 1.6 billion sentence pairs.

The model has a range of use cases, including information retrieval, semantic textual similarity, text reranking, and more.

With 330 million parameters, the model supports single-GPU inference while delivering better performance than our small and base models. Additionally, we provide the following options:

- [`jina-embedding-t-en-v1`](https://huggingface.co/jinaai/jina-embedding-t-en-v1): 14 million parameters.
- [`jina-embedding-s-en-v1`](https://huggingface.co/jinaai/jina-embedding-s-en-v1): 35 million parameters.
- [`jina-embedding-b-en-v1`](https://huggingface.co/jinaai/jina-embedding-b-en-v1): 110 million parameters.
- [`jina-embedding-l-en-v1`](https://huggingface.co/jinaai/jina-embedding-l-en-v1): 330 million parameters **(you are here)**.
- `jina-embedding-1b-en-v1`: 1.2 billion parameters, 10 times bert-base (soon).
- `jina-embedding-6b-en-v1`: 6 billion parameters, 30 times bert-base (soon).

## Data & Parameters

Please check out our [technical report](https://arxiv.org/abs/2307.11224).
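The use cases listed under Intended Usage (retrieval, semantic similarity, reranking) all come down to comparing embedding vectors, usually by cosine similarity. The following is a minimal sketch of that ranking step in plain Python. The vectors are made-up three-dimensional placeholders, not real model output, and the helper names `cosine_similarity` and `rank_documents` are hypothetical, chosen only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder vectors for illustration only; real embeddings come from the model.
query = [1.0, 0.0, 1.0]
documents = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_documents(query, documents)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

Because cosine similarity depends only on direction, the second document ranks first even though its vector is twice as long as the query's.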
## Metrics

We compared the model against `all-minilm-l6-v2` and `all-mpnet-base-v2` from sbert, and `text-embedding-ada-002` from OpenAI:

|Name|Parameters|Dimension|
|------------------------------|-----|------|
|all-minilm-l6-v2|23M|384|
|all-mpnet-base-v2|110M|768|
|text-embedding-ada-002|Unknown (OpenAI API)|1536|
|jina-embedding-t-en-v1|14M|312|
|jina-embedding-s-en-v1|35M|512|
|jina-embedding-b-en-v1|110M|768|
|jina-embedding-l-en-v1|330M|1024|

|Name|STS12|STS13|STS14|STS15|STS16|STS17|TRECCOVID|Quora|SciFact|
|------------------------------|-----|-----|-----|-----|-----|-----|--------|-----|-----|
|all-minilm-l6-v2|0.724|0.806|0.756|0.854|0.79|0.876|0.473|0.876|0.645|
|all-mpnet-base-v2|0.726|0.835|**0.78**|0.857|0.8|**0.906**|0.513|0.875|0.656|
|text-embedding-ada-002|0.698|0.833|0.761|0.861|**0.86**|0.903|**0.685**|0.876|**0.726**|
|jina-embedding-t-en-v1|0.717|0.773|0.731|0.829|0.777|0.860|0.482|0.840|0.522|
|jina-embedding-s-en-v1|0.743|0.786|0.738|0.837|0.80|0.875|0.523|0.857|0.524|
|jina-embedding-b-en-v1|**0.751**|0.809|0.761|0.856|0.812|0.890|0.606|0.876|0.594|
|jina-embedding-l-en-v1|0.739|**0.844**|0.778|**0.863**|0.821|0.896|0.566|**0.882**|0.608|

## Usage

Use with Jina AI Finetuner:

```python
# Install the package first (prefix with "!" inside a notebook):
#   pip install finetuner
import finetuner

# Load the model and encode two sentences.
model = finetuner.build_model('jinaai/jina-embedding-l-en-v1')
embeddings = finetuner.encode(
    model=model,
    data=['how is the weather today', 'What is the current weather like today?']
)
# Cosine similarity between the two sentence embeddings.
print(finetuner.cos_sim(embeddings[0], embeddings[1]))
```

Use with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

sentences = ['how is the weather today', 'What is the current weather like today?']

# Load the model and encode both sentences.
model = SentenceTransformer('jinaai/jina-embedding-l-en-v1')
embeddings = model.encode(sentences)
# Cosine similarity between the two sentence embeddings.
print(cos_sim(embeddings[0], embeddings[1]))
```

## Fine-tuning

To fine-tune the model, consider [Finetuner](https://github.com/jina-ai/finetuner).

## Plans

1. The development of `jina-embedding-s-en-v2` is underway with two main objectives: improving performance and increasing the maximum sequence length.
2. We are working on a bilingual embedding model that combines English with a second language X. The upcoming model will be called `jina-embedding-s/b/l-de-v1`.

## Contact

Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.

## Citation

If you find Jina Embeddings useful in your research, please cite the following paper:

```latex
@misc{günther2023jina,
      title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
      author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
      year={2023},
      eprint={2307.11224},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```