philipphager/baidu-ultr_uva-bert_naive-listwise


philipphager committed on Apr 24

Commit 797f9d4 • 1 Parent(s): f770d24

Update README.md


README.md +34 -2


@@ -16,14 +16,46 @@ metrics:
 A flax-based MonoBERT cross encoder trained on the Baidu-ULTR dataset with a **listwise softmax cross-entropy loss on clicks**. The loss is called "naive" as we use user clicks as a signal of relevance without any additional position bias correction. For more info, read our paper here.
 ## Usage
-```
 from src.model import ListwiseCrossEncoder
 model = ListwiseCrossEncoder.from_pretrained(
     "philipphager/baidu-ultr_uva-bert_naive-listwise",
 )
-model(batch)
 ```
 ## Test Results on Baidu-ULTR Expert Annotations

 A flax-based MonoBERT cross encoder trained on the Baidu-ULTR dataset with a **listwise softmax cross-entropy loss on clicks**. The loss is called "naive" as we use user clicks as a signal of relevance without any additional position bias correction. For more info, read our paper here.
 ## Usage
+```Python
+import jax.numpy as jnp
 from src.model import ListwiseCrossEncoder
 model = ListwiseCrossEncoder.from_pretrained(
     "philipphager/baidu-ultr_uva-bert_naive-listwise",
 )
+# Mock batch from Baidu-ULTR with 4 documents, each with 32 tokens
+batch = {
+    # Query_id for each document
+    "query_id": jnp.array([1, 1, 1, 1]),
+    # Document position in SERP
+    "positions": jnp.array([1, 2, 3, 4]),
+    # Token ids for each query/document combination
+    "tokens": jnp.array([
+        [2, 21448, 21874, 21436, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 1, 20206, 4012, 2860, 5996, 9526, 10966, 11858, 15035, 2677, 21446, 21401, 21401, 10092, 250, 8547, 7936, 2677, 1, 21874],
+        [2, 21448, 21874, 21436, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 1, 16794, 4522, 2082, 2860, 16923, 3186, 15035, 2677, 21446, 21401, 21401, 10092, 21448, 19087, 480, 21449, 21401, 8747, 21436],
+        [2, 21448, 21874, 21436, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 1, 20206, 10082, 9773, 6164, 8825, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 10092, 21455, 4516, 2049, 20167, 15035],
+        [2, 21448, 21874, 21436, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 1, 2618, 8520, 2860, 5996, 9526, 15035, 2677, 21446, 21401, 21401, 10092, 21455, 2618, 8520, 2860, 5996, 9526, 21446, 21401],
+    ]),
+    # Specify if a token id belongs to the query (0) or document (1)
+    "token_types": jnp.array([
+        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
+        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
+        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
+        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
+    ]),
+    # Marks if a token should be attended to (True) or ignored, e.g., padding tokens (False):
+    "attention_mask": jnp.array([
+        [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True],
+        [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True],
+        [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True],
+        [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True],
+    ]),
+}
+outputs = model(batch)
+print(outputs)
 ```
 ## Test Results on Baidu-ULTR Expert Annotations
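
The model description in this README refers to a "naive" listwise softmax cross-entropy loss on clicks. As a rough illustration of that idea, the sketch below computes such a loss for the documents of a single query in plain Python. It is not the repository's implementation: the function name `listwise_softmax_cross_entropy` and the example scores and clicks are hypothetical, and a JAX version would typically rely on `jax.nn.log_softmax` instead of the manual log-sum-exp.

```python
import math


def listwise_softmax_cross_entropy(scores, clicks):
    """Naive listwise loss for one query (hypothetical helper).

    Clicks are used directly as relevance labels, without any
    position bias correction.
    """
    # Numerically stable log-softmax over all documents of the query.
    max_score = max(scores)
    log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    log_probs = [s - log_norm for s in scores]
    # Cross-entropy between the click vector and the softmax distribution.
    return -sum(c * lp for c, lp in zip(clicks, log_probs))


# Hypothetical model scores and clicks for four documents of one query.
scores = [2.0, 0.5, 0.1, -1.0]
clicks = [1, 0, 0, 0]
print(listwise_softmax_cross_entropy(scores, clicks))
```

Because the softmax normalizes over the whole result list, raising the score of a clicked document lowers the loss only relative to the other documents shown for the same query, which is why the mock batch above groups documents by `query_id`.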