
About the ratio of positive and negative examples when fine-tuning

#26
by Kaguya-19 - opened

bge-reranker is a very good piece of work, and we plan to fine-tune it on our own data. However, we noticed that the fine-tuning script in FlagEmbedding sets the ratio of positive to negative examples at 1:15. Given that general classification tasks usually use a 1:1 ratio of positives to negatives, were there experiments validating the choice of multiple negatives? Thank you very much!

Beijing Academy of Artificial Intelligence org
edited Aug 16

@Kaguya-19 The training uses a classification (softmax) cross-entropy loss, not a binary sigmoid loss, so there is no positive/negative imbalance problem. For contrastive learning, more negative samples generally gives better results.
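The point can be sketched without any framework: under a listwise cross-entropy loss, each query is a 1-of-N classification over its passage group, so the 1:15 split defines the size of the candidate set rather than an imbalanced binary label distribution. A minimal sketch, assuming one positive at index 0 followed by negatives (this function is illustrative, not the actual FlagEmbedding implementation):

```python
import math

def listwise_ce_loss(scores):
    """Cross-entropy over one query's passage group.

    scores: list of reranker logits, where index 0 is the positive
    passage and the remaining entries are negatives. The loss is
    -log softmax(scores)[0], i.e. a 1-of-len(scores) classification,
    not a positive-vs-negative binary decision.
    """
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]

# 1 positive + 15 negatives vs. 1 positive + 1 negative:
loss_15 = listwise_ce_loss([2.0] + [0.0] * 15)
loss_1 = listwise_ce_loss([2.0, 0.0])
print(loss_15, loss_1)
```

Adding negatives only enlarges the softmax denominator (a harder, more informative classification task), which is why more negatives tend to help rather than cause imbalance.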

Kaguya-19 changed discussion status to closed
