---
license: mit
datasets:
- philipphager/baidu-ultr-pretrain
- philipphager/baidu-ultr_uva-mlm-ctr
metrics:
- log-likelihood
- dcg@1
- dcg@3
- dcg@5
- dcg@10
- ndcg@10
- mrr@10
---

# Naive Listwise MonoBERT trained on Baidu-ULTR

A flax-based MonoBERT cross encoder trained on the [Baidu-ULTR](https://arxiv.org/abs/2207.03051) dataset with a **listwise softmax cross-entropy loss on clicks**. The loss is called "naive" as we use user clicks as a signal of relevance without any additional position bias correction. For more info, [read our paper](https://arxiv.org/abs/2404.02543) and [find the code for this model here](https://github.com/philipphager/baidu-bert-model).

## Test Results on Baidu-ULTR Expert Annotations

| Model                | log-likelihood | DCG@1      | DCG@3      | DCG@5      | DCG@10     | nDCG@10    | MRR@10     |
|----------------------|----------------|------------|------------|------------|------------|------------|------------|
| Pointwise Naive      | 0.2268         | 1.6411     | 3.4624     | 4.7520     | 7.2506     | 0.3567     | 0.6089     |
| Pointwise IPS        | 0.2216         | 1.2948     | 2.8106     | 3.9767     | 6.2961     | 0.3075     | 0.5343     |
| Pointwise Two Tower  | 0.2176         | 1.6288     | 3.4712     | 4.8220     | 7.4556     | 0.3668     | 0.6071     |
| **Listwise Naive**   | -              | **1.9738** | **4.1609** | **5.6861** | **8.5432** | **0.4091** | **0.6436** |
| Listwise IPS         | -              | 1.7466     | 3.6378     | 4.9797     | 7.5790     | 0.3665     | 0.6112     |
| Listwise DLA         | -              | 1.7954     | 3.8054     | 5.2083     | 7.9342     | 0.3848     | 0.6261     |

## Usage

Here is an example of downloading the model and calling it for inference on a mock batch of input data. For more details on how to use the model on the Baidu-ULTR dataset, take a look at our [training](https://github.com/philipphager/baidu-bert-model/blob/main/main.py) and [evaluation scripts](https://github.com/philipphager/baidu-bert-model/blob/main/eval.py) in our code repository.

```Python
import jax.numpy as jnp

from src.model import ListwiseCrossEncoder

model = ListwiseCrossEncoder.from_pretrained(
    "philipphager/baidu-ultr_uva-bert_naive-listwise",
)

# Mock batch following Baidu-ULTR with 4 documents, each with 8 tokens
batch = {
    # Query_id for each document
    "query_id": jnp.array([1, 1, 1, 1]),
    # Document position in SERP
    "positions": jnp.array([1, 2, 3, 4]),
    # Token ids for: [CLS] Query [SEP] Document
    "tokens": jnp.array([
        [2, 21448, 21874, 21436, 1, 20206, 4012, 2860],
        [2, 21448, 21874, 21436, 1, 16794, 4522, 2082],
        [2, 21448, 21874, 21436, 1, 20206, 10082, 9773],
        [2, 21448, 21874, 21436, 1, 2618, 8520, 2860],
    ]),
    # Specify if a token id belongs to the query (0) or document (1)
    "token_types": jnp.array([
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
    ]),
    # Marks if a token should be attended to (True) or ignored, e.g., padding tokens (False):
    "attention_mask": jnp.array([
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
    ]),
}

outputs = model(batch, train=False)
print(outputs)
```

## Reference

```
@inproceedings{Hager2024BaiduULTR,
  author = {Philipp Hager and Romain Deffayet and Jean-Michel Renders and Onno Zoeter and Maarten de Rijke},
  title = {Unbiased Learning to Rank Meets Reality: Lessons from Baidu’s Large-Scale Search Dataset},
  booktitle = {Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR`24)},
  organization = {ACM},
  year = {2024},
}
```
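
## Sketch of the Naive Listwise Loss

To make the "listwise softmax cross-entropy loss on clicks" from the model description concrete, here is a minimal sketch for a single query. It is written in plain Python for readability and is not taken from our training code, which is flax-based and may differ in details such as batching and padding. The function name `listwise_softmax_cross_entropy` and the arguments `scores` and `clicks` are hypothetical names chosen for this illustration.

```python
import math


def listwise_softmax_cross_entropy(scores, clicks):
    """Softmax cross-entropy over the documents of one query.

    scores: model scores, one per document in the list (hypothetical input).
    clicks: click labels per document (1 = clicked, 0 = not clicked).
    Clicks are used directly as relevance labels, with no position-bias
    correction, which is what "naive" refers to.
    """
    if len(scores) != len(clicks):
        raise ValueError("scores and clicks must have the same length")

    # Numerically stable log-sum-exp over all documents in the list.
    max_score = max(scores)
    log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))

    # Negative log-softmax probability, summed over clicked documents.
    return -sum(c * (s - log_norm) for s, c in zip(scores, clicks))


# Four documents for one query, the first one was clicked.
scores = [2.0, 0.5, 1.0, -1.0]
clicks = [1, 0, 0, 0]
print(listwise_softmax_cross_entropy(scores, clicks))
```

The loss shrinks as the clicked documents receive higher scores relative to the other documents in the same list, and a list without any clicks contributes zero loss under this formulation.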