Update README.md
README.md CHANGED
@@ -9,6 +9,8 @@ Data, knowledge store and source code to reproduce the baseline experiments for
## NEWS:
- 19.04.2024: The submission page (with eval.ai) for the shared task is live; you can participate by submitting your predictions [here](https://eval.ai/web/challenges/challenge-page/2285/overview)!
- 15.07.2024: To facilitate human evaluation, we now ask submission files to include a `scraped_text` field; have a look at []() for more information!

## Dataset
The training and dev datasets can be found under [data](https://huggingface.co/chenxwh/AVeriTeC/tree/main/data). Test data will be released at a later date. Each claim has the following structure:

@@ -119,13 +121,40 @@ Then evaluate the veracity prediction performance with (see [evaluate_veracity.p
python -m src.prediction.evaluate_veracity
```
Results for the dev and test sets are shown below. We recommend using 0.25 as the cut-off score for evaluating the relevance of the evidence.

| Model             | Split | Q only | Q + A | Veracity @ 0.2 | @ 0.25 | @ 0.3 |
|-------------------|-------|--------|-------|----------------|--------|-------|
| AVeriTeC-BLOOM-7b | dev   | 0.240  | 0.185 | 0.186          | 0.092  | 0.050 |
| AVeriTeC-BLOOM-7b | test  | 0.248  | 0.185 | 0.176          | 0.109  | 0.059 |
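The cut-off interacts with veracity scoring roughly as follows: a claim only contributes to the veracity score when its retrieved evidence is judged relevant enough. The sketch below is purely illustrative, not the official scorer; `evidence_score`, the toy numbers, and the labels are all made up for the example.

```python
# Hypothetical sketch of thresholded veracity scoring: a prediction only
# counts as correct when its evidence relevance clears the cut-off.
# `evidence_score` stands in for the real question/answer matching metric.

def veracity_at(predictions, threshold):
    """Fraction of claims with a correct label AND evidence score >= threshold."""
    hits = sum(
        1 for p in predictions
        if p["evidence_score"] >= threshold and p["pred_label"] == p["gold_label"]
    )
    return hits / len(predictions)

# Toy predictions (illustrative values only).
preds = [
    {"evidence_score": 0.31, "pred_label": "Supported", "gold_label": "Supported"},
    {"evidence_score": 0.22, "pred_label": "Refuted",   "gold_label": "Refuted"},
    {"evidence_score": 0.40, "pred_label": "Refuted",   "gold_label": "Supported"},
    {"evidence_score": 0.10, "pred_label": "Supported", "gold_label": "Supported"},
]
print(veracity_at(preds, 0.25))  # only the first prediction clears 0.25 with a correct label -> 0.25
```

Raising the threshold can only keep or shrink the set of credited claims, which is why the columns decrease from @ 0.2 to @ 0.3.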
## Format for submission files

To facilitate human evaluation, the submission file should include the text of the evidence documents used, retrieved through the `url` field. If external knowledge is utilized, please provide the scraped text. If our provided knowledge store is used, this can be achieved by running the following code block (see [veracity_with_scraped_text.py](https://huggingface.co/chenxwh/AVeriTeC/blob/main/src/prediction/veracity_with_scraped_text.py)), which adds the text to the previous prediction file. An example output for the dev set is [here](https://huggingface.co/chenxwh/AVeriTeC/blob/main/data_store/dev_veracity_prediction_for_submission.json).

```bash
python -m src.prediction.veracity_with_scraped_text --knowledge_store_dir <directory of the knowledge store json files>
```

Each line of the final submission file is a json object with the following information:

```json
{
    "claim_id": "The ID of the sample.",
    "claim": "The claim text itself.",
    "pred_label": "The predicted label of the claim.",
    "evidence": [
        {
            "question": "The text of the generated question.",
            "answer": "The text of the answer to the generated question.",
            "url": "The source URL for the answer.",
            "scraped_text": "The text scraped from the URL."
        }
    ]
}
```
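Before uploading, each line can be checked against this format. The sketch below is a minimal illustration, not part of the official tooling; the example `pred_label` value and claim text are made up.

```python
import json

# Fields required in each submission line and in each evidence entry,
# following the json object described above.
CLAIM_FIELDS = {"claim_id", "claim", "pred_label", "evidence"}
EVIDENCE_FIELDS = {"question", "answer", "url", "scraped_text"}

def check_submission_line(line):
    """Return a list of problems found in one submission line (empty if OK)."""
    problems = []
    record = json.loads(line)
    missing = CLAIM_FIELDS - record.keys()
    if missing:
        problems.append(f"missing claim fields: {sorted(missing)}")
    for i, ev in enumerate(record.get("evidence", [])):
        ev_missing = EVIDENCE_FIELDS - ev.keys()
        if ev_missing:
            problems.append(f"evidence[{i}] missing: {sorted(ev_missing)}")
    return problems

# Illustrative record only; real label values follow the dataset's annotation scheme.
example = json.dumps({
    "claim_id": 0,
    "claim": "Example claim.",
    "pred_label": "Supported",
    "evidence": [{"question": "Q?", "answer": "A.", "url": "https://example.com",
                  "scraped_text": "Page text."}],
})
print(check_submission_line(example))  # no problems found -> []
```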
## Citation
If you find AVeriTeC useful for your research and applications, please cite us using this BibTeX:
```bibtex