Update README.md
README.md CHANGED
@@ -9,6 +9,8 @@ Data, knowledge store and source code to reproduce the baseline experiments for
## NEWS:
- 19.04.2024: The submission page (with eval.ai) for the shared task is live; you can participate by submitting your predictions [here](https://eval.ai/web/challenges/challenge-page/2285/overview)!
- 15.07.2024: To facilitate human evaluation, we now ask submission files to include a `scraped_text` field; have a look at []() for more information!

## Dataset
The training and dev datasets can be found under [data](https://huggingface.co/chenxwh/AVeriTeC/tree/main/data). Test data will be released at a later date. Each claim has the following structure:

@@ -119,13 +121,40 @@ Then evaluate the veracity prediction performance with (see [evaluate_veracity.p
python -m src.prediction.evaluate_veracity
```
Results for the dev and test sets are shown below. We recommend using 0.25 as the cut-off score for evaluating the relevance of the evidence.

| Model             | Split | Q only | Q + A | Veracity @ 0.2 | @ 0.25 | @ 0.3 |
|-------------------|-------|--------|-------|----------------|--------|-------|
| AVeriTeC-BLOOM-7b | dev   | 0.240  | 0.185 | 0.186          | 0.092  | 0.050 |
| AVeriTeC-BLOOM-7b | test  | 0.248  | 0.185 | 0.176          | 0.109  | 0.059 |
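The cut-off interacts with veracity scoring roughly as follows: a claim only contributes to the veracity score when its retrieved evidence is judged relevant enough. The sketch below is purely illustrative, not the official scorer; `evidence_score`, the toy numbers, and the labels are all made up for the example.

```python
# Hypothetical sketch of thresholded veracity scoring: a prediction only
# counts as correct when its evidence relevance clears the cut-off.
# `evidence_score` stands in for the real question/answer matching metric.

def veracity_at(predictions, threshold):
    """Fraction of claims with a correct label AND evidence score >= threshold."""
    hits = sum(
        1 for p in predictions
        if p["evidence_score"] >= threshold and p["pred_label"] == p["gold_label"]
    )
    return hits / len(predictions)

# Toy predictions (illustrative values only).
preds = [
    {"evidence_score": 0.31, "pred_label": "Supported", "gold_label": "Supported"},
    {"evidence_score": 0.22, "pred_label": "Refuted",   "gold_label": "Refuted"},
    {"evidence_score": 0.40, "pred_label": "Refuted",   "gold_label": "Supported"},
    {"evidence_score": 0.10, "pred_label": "Supported", "gold_label": "Supported"},
]
print(veracity_at(preds, 0.25))  # only the first prediction clears 0.25 with a correct label -> 0.25
```

Raising the threshold can only keep or shrink the set of credited claims, which is why the columns decrease from @ 0.2 to @ 0.3.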
## Format for submission files

To facilitate human evaluation, the submission file should include the text of the evidence documents used, retrieved through the `url` field. If external knowledge is utilized, please provide the scraped text. If our provided knowledge store is used, this can be achieved by running the following code block (see [veracity_with_scraped_text.py](https://huggingface.co/chenxwh/AVeriTeC/blob/main/src/prediction/veracity_with_scraped_text.py)), which adds the text to the previous prediction file. An example output for the dev set is [here](https://huggingface.co/chenxwh/AVeriTeC/blob/main/data_store/dev_veracity_prediction_for_submission.json).

```bash
python -m src.prediction.veracity_with_scraped_text --knowledge_store_dir <directory of the knowledge store json files>
```

Each line of the final submission file is a json object with the following information:

```json
{
    "claim_id": "The ID of the sample.",
    "claim": "The claim text itself.",
    "pred_label": "The predicted label of the claim.",
    "evidence": [
        {
            "question": "The text of the generated question.",
            "answer": "The text of the answer to the generated question.",
            "url": "The source URL for the answer.",
            "scraped_text": "The text scraped from the URL."
        }
    ]
}
```
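Before uploading, each line can be checked against this format. The sketch below is a minimal illustration, not part of the official tooling; the example `pred_label` value and claim text are made up.

```python
import json

# Fields required in each submission line and in each evidence entry,
# following the json object described above.
CLAIM_FIELDS = {"claim_id", "claim", "pred_label", "evidence"}
EVIDENCE_FIELDS = {"question", "answer", "url", "scraped_text"}

def check_submission_line(line):
    """Return a list of problems found in one submission line (empty if OK)."""
    problems = []
    record = json.loads(line)
    missing = CLAIM_FIELDS - record.keys()
    if missing:
        problems.append(f"missing claim fields: {sorted(missing)}")
    for i, ev in enumerate(record.get("evidence", [])):
        ev_missing = EVIDENCE_FIELDS - ev.keys()
        if ev_missing:
            problems.append(f"evidence[{i}] missing: {sorted(ev_missing)}")
    return problems

# Illustrative record only; real label values follow the dataset's annotation scheme.
example = json.dumps({
    "claim_id": 0,
    "claim": "Example claim.",
    "pred_label": "Supported",
    "evidence": [{"question": "Q?", "answer": "A.", "url": "https://example.com",
                  "scraped_text": "Page text."}],
})
print(check_submission_line(example))  # no problems found -> []
```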
## Citation
If you find AVeriTeC useful for your research and applications, please cite us using this BibTeX:
```bibtex