metadata
tags:
  - mteb
  - qihoo360
  - 奇虎360
  - RAG-retrieval
model-index:
  - name: 360Zhinao_search
    results:
      - task:
          type: Reranking
        dataset:
          type: C-MTEB/CMedQAv1-reranking
          name: MTEB CMedQAv1
          config: default
          split: test
          revision: None
        metrics:
          - type: map
            value: 87.004722953844
          - type: mrr
            value: 89.34686507936507
      - task:
          type: Reranking
        dataset:
          type: C-MTEB/CMedQAv2-reranking
          name: MTEB CMedQAv2
          config: default
          split: test
          revision: None
        metrics:
          - type: map
            value: 88.48306990136507
          - type: mrr
            value: 90.57761904761904
      - task:
          type: Reranking
        dataset:
          type: C-MTEB/Mmarco-reranking
          name: MTEB MMarcoReranking
          config: default
          split: dev
          revision: None
        metrics:
          - type: map
            value: 32.40909999537645
          - type: mrr
            value: 31.48690476190476
      - task:
          type: Reranking
        dataset:
          type: C-MTEB/T2Reranking
          name: MTEB T2Reranking
          config: default
          split: dev
          revision: None
        metrics:
          - type: map
            value: 67.80300509862872
          - type: mrr
            value: 78.14543234355354
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/CmedqaRetrieval
          name: MTEB CmedqaRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 27.171
          - type: map_at_10
            value: 40.109
          - type: map_at_100
            value: 41.937999999999995
          - type: map_at_1000
            value: 42.051
          - type: map_at_3
            value: 35.882999999999996
          - type: map_at_5
            value: 38.22
          - type: mrr_at_1
            value: 41.285
          - type: mrr_at_10
            value: 49.247
          - type: mrr_at_100
            value: 50.199000000000005
          - type: mrr_at_1000
            value: 50.245
          - type: mrr_at_3
            value: 46.837
          - type: mrr_at_5
            value: 48.223
          - type: ndcg_at_1
            value: 41.285
          - type: ndcg_at_10
            value: 46.727000000000004
          - type: ndcg_at_100
            value: 53.791
          - type: ndcg_at_1000
            value: 55.706
          - type: ndcg_at_3
            value: 41.613
          - type: ndcg_at_5
            value: 43.702999999999996
          - type: precision_at_1
            value: 41.285
          - type: precision_at_10
            value: 10.34
          - type: precision_at_100
            value: 1.6019999999999999
          - type: precision_at_1000
            value: 0.184
          - type: precision_at_3
            value: 23.423
          - type: precision_at_5
            value: 16.914
          - type: recall_at_1
            value: 27.171
          - type: recall_at_10
            value: 57.04900000000001
          - type: recall_at_100
            value: 86.271
          - type: recall_at_1000
            value: 99.02300000000001
          - type: recall_at_3
            value: 41.528
          - type: recall_at_5
            value: 48.162
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/CovidRetrieval
          name: MTEB CovidRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 73.762
          - type: map_at_10
            value: 81.663
          - type: map_at_100
            value: 81.87100000000001
          - type: map_at_1000
            value: 81.877
          - type: map_at_3
            value: 80.10199999999999
          - type: map_at_5
            value: 81.162
          - type: mrr_at_1
            value: 74.078
          - type: mrr_at_10
            value: 81.745
          - type: mrr_at_100
            value: 81.953
          - type: mrr_at_1000
            value: 81.959
          - type: mrr_at_3
            value: 80.25999999999999
          - type: mrr_at_5
            value: 81.266
          - type: ndcg_at_1
            value: 73.973
          - type: ndcg_at_10
            value: 85.021
          - type: ndcg_at_100
            value: 85.884
          - type: ndcg_at_1000
            value: 86.02300000000001
          - type: ndcg_at_3
            value: 82.03399999999999
          - type: ndcg_at_5
            value: 83.905
          - type: precision_at_1
            value: 73.973
          - type: precision_at_10
            value: 9.631
          - type: precision_at_100
            value: 1
          - type: precision_at_1000
            value: 0.101
          - type: precision_at_3
            value: 29.329
          - type: precision_at_5
            value: 18.546000000000003
          - type: recall_at_1
            value: 73.762
          - type: recall_at_10
            value: 95.258
          - type: recall_at_100
            value: 98.946
          - type: recall_at_1000
            value: 100
          - type: recall_at_3
            value: 87.46000000000001
          - type: recall_at_5
            value: 91.93900000000001
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/DuRetrieval
          name: MTEB DuRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 25.967000000000002
          - type: map_at_10
            value: 79.928
          - type: map_at_100
            value: 82.76400000000001
          - type: map_at_1000
            value: 82.794
          - type: map_at_3
            value: 54.432
          - type: map_at_5
            value: 69.246
          - type: mrr_at_1
            value: 89
          - type: mrr_at_10
            value: 92.81
          - type: mrr_at_100
            value: 92.857
          - type: mrr_at_1000
            value: 92.86
          - type: mrr_at_3
            value: 92.467
          - type: mrr_at_5
            value: 92.67699999999999
          - type: ndcg_at_1
            value: 89
          - type: ndcg_at_10
            value: 87.57000000000001
          - type: ndcg_at_100
            value: 90.135
          - type: ndcg_at_1000
            value: 90.427
          - type: ndcg_at_3
            value: 84.88900000000001
          - type: ndcg_at_5
            value: 84.607
          - type: precision_at_1
            value: 89
          - type: precision_at_10
            value: 42.245
          - type: precision_at_100
            value: 4.8340000000000005
          - type: precision_at_1000
            value: 0.49
          - type: precision_at_3
            value: 75.883
          - type: precision_at_5
            value: 64.88000000000001
          - type: recall_at_1
            value: 25.967000000000002
          - type: recall_at_10
            value: 89.79599999999999
          - type: recall_at_100
            value: 98.042
          - type: recall_at_1000
            value: 99.61
          - type: recall_at_3
            value: 57.084
          - type: recall_at_5
            value: 74.763
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/EcomRetrieval
          name: MTEB EcomRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 53.6
          - type: map_at_10
            value: 63.94800000000001
          - type: map_at_100
            value: 64.37899999999999
          - type: map_at_1000
            value: 64.39200000000001
          - type: map_at_3
            value: 61.683
          - type: map_at_5
            value: 63.078
          - type: mrr_at_1
            value: 53.6
          - type: mrr_at_10
            value: 63.94800000000001
          - type: mrr_at_100
            value: 64.37899999999999
          - type: mrr_at_1000
            value: 64.39200000000001
          - type: mrr_at_3
            value: 61.683
          - type: mrr_at_5
            value: 63.078
          - type: ndcg_at_1
            value: 53.6
          - type: ndcg_at_10
            value: 68.904
          - type: ndcg_at_100
            value: 71.019
          - type: ndcg_at_1000
            value: 71.345
          - type: ndcg_at_3
            value: 64.30799999999999
          - type: ndcg_at_5
            value: 66.8
          - type: precision_at_1
            value: 53.6
          - type: precision_at_10
            value: 8.44
          - type: precision_at_100
            value: 0.943
          - type: precision_at_1000
            value: 0.097
          - type: precision_at_3
            value: 23.967
          - type: precision_at_5
            value: 15.58
          - type: recall_at_1
            value: 53.6
          - type: recall_at_10
            value: 84.39999999999999
          - type: recall_at_100
            value: 94.3
          - type: recall_at_1000
            value: 96.8
          - type: recall_at_3
            value: 71.89999999999999
          - type: recall_at_5
            value: 77.9
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/MMarcoRetrieval
          name: MTEB MMarcoRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 71.375
          - type: map_at_10
            value: 80.05600000000001
          - type: map_at_100
            value: 80.28699999999999
          - type: map_at_1000
            value: 80.294
          - type: map_at_3
            value: 78.479
          - type: map_at_5
            value: 79.51899999999999
          - type: mrr_at_1
            value: 73.739
          - type: mrr_at_10
            value: 80.535
          - type: mrr_at_100
            value: 80.735
          - type: mrr_at_1000
            value: 80.742
          - type: mrr_at_3
            value: 79.212
          - type: mrr_at_5
            value: 80.059
          - type: ndcg_at_1
            value: 73.739
          - type: ndcg_at_10
            value: 83.321
          - type: ndcg_at_100
            value: 84.35000000000001
          - type: ndcg_at_1000
            value: 84.542
          - type: ndcg_at_3
            value: 80.401
          - type: ndcg_at_5
            value: 82.107
          - type: precision_at_1
            value: 73.739
          - type: precision_at_10
            value: 9.878
          - type: precision_at_100
            value: 1.039
          - type: precision_at_1000
            value: 0.106
          - type: precision_at_3
            value: 30.053
          - type: precision_at_5
            value: 18.953999999999997
          - type: recall_at_1
            value: 71.375
          - type: recall_at_10
            value: 92.84599999999999
          - type: recall_at_100
            value: 97.49799999999999
          - type: recall_at_1000
            value: 98.992
          - type: recall_at_3
            value: 85.199
          - type: recall_at_5
            value: 89.22
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/MedicalRetrieval
          name: MTEB MedicalRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 55.60000000000001
          - type: map_at_10
            value: 61.035
          - type: map_at_100
            value: 61.541999999999994
          - type: map_at_1000
            value: 61.598
          - type: map_at_3
            value: 59.683
          - type: map_at_5
            value: 60.478
          - type: mrr_at_1
            value: 55.60000000000001
          - type: mrr_at_10
            value: 61.035
          - type: mrr_at_100
            value: 61.541999999999994
          - type: mrr_at_1000
            value: 61.598
          - type: mrr_at_3
            value: 59.683
          - type: mrr_at_5
            value: 60.478
          - type: ndcg_at_1
            value: 55.60000000000001
          - type: ndcg_at_10
            value: 63.686
          - type: ndcg_at_100
            value: 66.417
          - type: ndcg_at_1000
            value: 67.92399999999999
          - type: ndcg_at_3
            value: 60.951
          - type: ndcg_at_5
            value: 62.388
          - type: precision_at_1
            value: 55.60000000000001
          - type: precision_at_10
            value: 7.199999999999999
          - type: precision_at_100
            value: 0.8540000000000001
          - type: precision_at_1000
            value: 0.097
          - type: precision_at_3
            value: 21.532999999999998
          - type: precision_at_5
            value: 13.62
          - type: recall_at_1
            value: 55.60000000000001
          - type: recall_at_10
            value: 72
          - type: recall_at_100
            value: 85.39999999999999
          - type: recall_at_1000
            value: 97.3
          - type: recall_at_3
            value: 64.60000000000001
          - type: recall_at_5
            value: 68.10000000000001
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/T2Retrieval
          name: MTEB T2Retrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 28.314
          - type: map_at_10
            value: 80.268
          - type: map_at_100
            value: 83.75399999999999
          - type: map_at_1000
            value: 83.80499999999999
          - type: map_at_3
            value: 56.313
          - type: map_at_5
            value: 69.336
          - type: mrr_at_1
            value: 91.96
          - type: mrr_at_10
            value: 93.926
          - type: mrr_at_100
            value: 94
          - type: mrr_at_1000
            value: 94.003
          - type: mrr_at_3
            value: 93.587
          - type: mrr_at_5
            value: 93.804
          - type: ndcg_at_1
            value: 91.96
          - type: ndcg_at_10
            value: 87.12299999999999
          - type: ndcg_at_100
            value: 90.238
          - type: ndcg_at_1000
            value: 90.723
          - type: ndcg_at_3
            value: 88.347
          - type: ndcg_at_5
            value: 87.095
          - type: precision_at_1
            value: 91.96
          - type: precision_at_10
            value: 43.257
          - type: precision_at_100
            value: 5.064
          - type: precision_at_1000
            value: 0.517
          - type: precision_at_3
            value: 77.269
          - type: precision_at_5
            value: 64.89
          - type: recall_at_1
            value: 28.314
          - type: recall_at_10
            value: 85.917
          - type: recall_at_100
            value: 96.297
          - type: recall_at_1000
            value: 98.802
          - type: recall_at_3
            value: 57.75900000000001
          - type: recall_at_5
            value: 72.287
      - task:
          type: Retrieval
        dataset:
          type: C-MTEB/VideoRetrieval
          name: MTEB VideoRetrieval
          config: default
          split: dev
          revision: None
        metrics:
          - type: map_at_1
            value: 65.60000000000001
          - type: map_at_10
            value: 74.502
          - type: map_at_100
            value: 74.864
          - type: map_at_1000
            value: 74.875
          - type: map_at_3
            value: 73.3
          - type: map_at_5
            value: 74.07000000000001
          - type: mrr_at_1
            value: 65.60000000000001
          - type: mrr_at_10
            value: 74.502
          - type: mrr_at_100
            value: 74.864
          - type: mrr_at_1000
            value: 74.875
          - type: mrr_at_3
            value: 73.3
          - type: mrr_at_5
            value: 74.07000000000001
          - type: ndcg_at_1
            value: 65.60000000000001
          - type: ndcg_at_10
            value: 78.091
          - type: ndcg_at_100
            value: 79.838
          - type: ndcg_at_1000
            value: 80.10199999999999
          - type: ndcg_at_3
            value: 75.697
          - type: ndcg_at_5
            value: 77.07000000000001
          - type: precision_at_1
            value: 65.60000000000001
          - type: precision_at_10
            value: 8.9
          - type: precision_at_100
            value: 0.971
          - type: precision_at_1000
            value: 0.099
          - type: precision_at_3
            value: 27.533
          - type: precision_at_5
            value: 17.18
          - type: recall_at_1
            value: 65.60000000000001
          - type: recall_at_10
            value: 89
          - type: recall_at_100
            value: 97.1
          - type: recall_at_1000
            value: 99.1
          - type: recall_at_3
            value: 82.6
          - type: recall_at_5
            value: 85.9
license: apache-2.0
library_name: transformers

Model Introduction

360Zhinao-search is built by multi-task fine-tuning on top of a BERT model developed in-house. It achieves an average score of 75.05 on the retrieval task of the C-MTEB benchmark (C-MTEB-Retrieval), ranking first on the leaderboard at the time of writing.

The C-MTEB-Retrieval leaderboard covers 8 [query, passage] similarity retrieval subtasks from different domains and uses NDCG@10 (Normalized Discounted Cumulative Gain at rank 10) as the evaluation metric.
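
As a reminder of what NDCG@10 measures, here is a minimal sketch. It assumes the common formulation with linear gain and a log2 position discount, and it assumes the input list holds the relevance labels of all judged passages for the query. The function names `dcg_at_k` and `ndcg_at_k` and the example labels are illustrative and are not taken from the C-MTEB evaluation code.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k results (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ordering."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical relevance labels for one query, listed in ranked order.
ranking = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(round(ndcg_at_k(ranking), 4))  # → 0.9197
```

A perfect ranking scores 1.0. The scores reported in this document appear to be this value scaled to the 0-100 range and averaged over all queries of a subtask.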

| Model | T2Retrieval | MMarcoRetrieval | DuRetrieval | CovidRetrieval | CmedqaRetrieval | EcomRetrieval | MedicalRetrieval | VideoRetrieval | Avg |
|:--|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| 360Zhinao-search | 87.12 | 83.32 | 87.57 | 85.02 | 46.73 | 68.9 | 63.69 | 78.09 | 75.05 |
| AGE_Hybrid | 86.88 | 80.65 | 89.28 | 83.66 | 47.26 | 69.28 | 65.94 | 76.79 | 74.97 |
| OpenSearch-text-hybrid | 86.76 | 79.93 | 87.85 | 84.03 | 46.56 | 68.79 | 65.92 | 75.43 | 74.41 |
| piccolo-large-zh-v2 | 86.14 | 79.54 | 89.14 | 86.78 | 47.58 | 67.75 | 64.88 | 73.1 | 74.36 |
| stella-large-zh-v3-1792d | 85.56 | 79.14 | 87.13 | 82.44 | 46.87 | 68.62 | 65.18 | 73.89 | 73.6 |
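
The Avg column can be checked against the eight per-task NDCG@10 scores. The short sketch below assumes that Avg is the unweighted arithmetic mean over the subtasks; the dictionary simply copies the 360Zhinao-search row from the table above.

```python
# NDCG@10 scores of 360Zhinao-search, copied from the table above.
scores = {
    "T2Retrieval": 87.12,
    "MMarcoRetrieval": 83.32,
    "DuRetrieval": 87.57,
    "CovidRetrieval": 85.02,
    "CmedqaRetrieval": 46.73,
    "EcomRetrieval": 68.9,
    "MedicalRetrieval": 63.69,
    "VideoRetrieval": 78.09,
}

# Unweighted mean over the eight retrieval subtasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.3f}")
```

The rounded table entries average to about 75.055, which is in line with the reported 75.05.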

Optimization points

  1. Data filtering: strictly prevent leakage of C-MTEB-Retrieval test data by removing all test-set queries and passages from the training data;
  2. Data source enhancement: use open-source data and LLM-synthesized data to improve data diversity;
  3. Negative example mining: use multiple methods to mine hard negatives (examples that are difficult to distinguish from positives) to improve information gain;
  4. Training efficiency: use multi-machine, multi-GPU training with DeepSpeed to optimize GPU memory utilization.
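
The document does not describe how points 1 and 3 are implemented. As a rough illustration only, the sketch below shows one simple way each idea could look: an exact-match filter that drops training pairs overlapping with the test set, and a selector that takes the highest-scoring non-positive passages as hard negatives. The function names `drop_test_overlap` and `mine_hard_negatives`, the similarity scores, and the sample data are all hypothetical. A real pipeline would likely use fuzzier matching and, as the list says, several mining methods.

```python
from typing import Iterable, List, Set, Tuple


def drop_test_overlap(
    train_pairs: Iterable[Tuple[str, str]],
    test_queries: Set[str],
    test_passages: Set[str],
) -> List[Tuple[str, str]]:
    """Keep only pairs whose query and passage are both absent from the test set (exact match)."""
    return [
        (query, passage)
        for query, passage in train_pairs
        if query not in test_queries and passage not in test_passages
    ]


def mine_hard_negatives(
    scored_passages: Iterable[Tuple[str, float]],
    positive_ids: Set[str],
    num_negatives: int = 2,
) -> List[str]:
    """Return the ids of the highest-scoring passages that are not labeled positive."""
    ranked = sorted(scored_passages, key=lambda item: item[1], reverse=True)
    negatives = [pid for pid, _ in ranked if pid not in positive_ids]
    return negatives[:num_negatives]


# Hypothetical training pairs; the first query also occurs in the test set.
train = [("what color is the sky", "the sky is blue"), ("how tall is Everest", "about 8,849 m")]
clean = drop_test_overlap(train, test_queries={"what color is the sky"}, test_passages=set())
print(clean)  # → [('how tall is Everest', 'about 8,849 m')]

# Hypothetical retriever scores for one query whose labeled positive is "p1".
scores = [("p1", 0.91), ("p2", 0.88), ("p3", 0.35), ("p4", 0.80)]
hard_negatives = mine_hard_negatives(scores, positive_ids={"p1"})
print(hard_negatives)  # → ['p2', 'p4']
```

Passages that score close to the positive but are not labeled relevant are the ones a model finds hardest to separate, which is why they tend to carry more training signal than random negatives.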

Usage

from transformers import AutoModel, AutoTokenizer
import torch
import numpy as np

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained('qihoo360/360Zhinao-search')
model = AutoModel.from_pretrained('qihoo360/360Zhinao-search')

# "What color is the sky?" / "The sky is blue."
sentences = ['天空是什么颜色的', '天空是蓝色的']
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt', max_length=512)

if __name__ == "__main__":
    with torch.no_grad():
        last_hidden_state = model(**inputs, return_dict=True).last_hidden_state
        # Use the hidden state of the first token ([CLS]) as the sentence embedding.
        embeddings = last_hidden_state[:, 0]
        # L2-normalize so that the dot product below equals the cosine similarity.
        embeddings = torch.nn.functional.normalize(embeddings, dim=-1)
        embeddings = embeddings.cpu().numpy()

    print("embeddings:")
    print(embeddings)

    cos_sim = np.dot(embeddings[0], embeddings[1])
    print("cos_sim:", cos_sim)
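
For retrieval over more than one passage, the same normalized embeddings can be ranked by dot product. The following is a minimal sketch, not part of the original model card: the helper `rank_passages` is a hypothetical name, and the small hand-written vectors stand in for real model outputs so that the example runs without downloading the model.

```python
import numpy as np


def rank_passages(query_embedding: np.ndarray, passage_embeddings: np.ndarray, top_k: int = 3):
    """Return (index, score) pairs for the top_k passages by cosine similarity.

    Both inputs are assumed to be L2-normalized, so the dot product is the cosine similarity.
    """
    scores = passage_embeddings @ query_embedding
    order = np.argsort(-scores)[:top_k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]


# Stand-in unit vectors (assumed, for illustration) instead of real embeddings.
query = np.array([1.0, 0.0])
passages = np.array([[0.0, 1.0], [1.0, 0.0], [0.6, 0.8]])
top = rank_passages(query, passages, top_k=2)
print([index for index, _ in top])  # → [1, 2]
```

With the snippet above, `embeddings[0]` could serve as the query vector and `embeddings[1:]` as the passage matrix.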

Reference

bge fine-tuning code

C-MTEB official test script

License

The source code of this repository is released under the Apache 2.0 open-source license.

360Zhinao open-source models support commercial use. If you wish to use these models or continue training them for commercial purposes, please contact us by email ([email protected]) to apply. For the specific license terms, see the "360 Zhinao Open-Source Model License".