What threshold should I use to filter out irrelevant passages based on sigmoid scores?
I'm implementing a RAG application, and the first step is to find the documents related to the user's query. I'm using "bge-reranker-v2-m3" to compute relevance scores between the query and each document's title, then applying a sigmoid to map the raw float score into the range (0, 1). Now I need to set a threshold to filter out the unrelated documents.
For the query and passages mentioned in the model card, say:
query = "what is panda?"
passages = [
"hi",
"The giant panda, sometimes called a panda bear or simply panda, is a bear species endemic to China."
]
I got scores [0.0003, 0.9953], which look great. So I set the threshold to 0.5, which I believe is widely used.
But when I tested with my own query and passages, I got very low scores even though they should be strongly related. For example:
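For reference, the sigmoid mapping described above can be sketched in plain Python. This is only an illustration of the math, not the reranker's own code; the helper names `sigmoid` and `logit` are my own. It shows that a cutoff of 0.5 on the sigmoid score is the same as a cutoff of 0 on the raw score.

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw relevance score (logit) into (0, 1), numerically stable."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def logit(p: float) -> float:
    """Inverse of sigmoid: recover the raw score from a probability-like score."""
    return math.log(p / (1.0 - p))

# sigmoid(0) is exactly 0.5, so "score >= 0.5" means "raw score >= 0".
midpoint = sigmoid(0.0)

# Raw scores implied by the sigmoid scores reported for the model card example.
reported = [0.0003, 0.9953]
raw_scores = [logit(p) for p in reported]
```

Here the first passage has a strongly negative raw score and the second a strongly positive one, so any cutoff near 0.5 separates them.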
query = "请问政府工作报告中,粮食产量如何?"
passages = [
"政府工作报告 本文档包含了咱们国家历年的政府工作报告,包括粮食、教育、科技等信息。",
"测试报告 本文档包含了回归测试的详细结果和分析",
"格式文档 本文档包含了所有审批的格式文档,包含采购报告、周报、台账等"
]
I got scores [0.2093, 0.0001, 0.0005]. This means that if I set the threshold to 0.5, all passages will be filtered out, even though the first one looks strongly related to my query.
I have a couple of questions, if anyone can help:
- In the second case, why is the score so low? Are there any tricks to improve the accuracy?
- Based on my results, a threshold of 0.5 might not be a good idea, but which value should I choose? 0.2, or even lower?
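To make the problem concrete, here is a minimal sketch of the threshold filter applied to the scores reported above. The function name `filter_by_threshold` is hypothetical, and the passages are shortened to their titles; the scores are the ones I got, not recomputed here.

```python
def filter_by_threshold(passages, scores, threshold):
    """Keep (passage, score) pairs whose sigmoid score is at least `threshold`."""
    return [(p, s) for p, s in zip(passages, scores) if s >= threshold]

# Passage titles from the second example:
# "Government Work Report", "Test Report", "Format Documents".
titles = ["政府工作报告", "测试报告", "格式文档"]
scores = [0.2093, 0.0001, 0.0005]

kept_at_05 = filter_by_threshold(titles, scores, 0.5)  # nothing survives
kept_at_02 = filter_by_threshold(titles, scores, 0.2)  # only the first title survives
```

With 0.5 the relevant document is dropped along with the irrelevant ones, while 0.2 happens to keep it in this one example.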
Thanks,
Shaun
@Shitao I'd like to ask the same question.
Similar question: I get 0.5 with bce-reranker-base_v1 but only 0.2 with bge-reranker-v2-m3. Huge difference!