simonhughes22 committed
Commit 845f97b
1 Parent(s): 1a1b26d
Update README.md
README.md
CHANGED
@@ -9,8 +9,8 @@ The model was trained on the NLI data and a variety of datasets evaluating summa
 
 ## Performance
 
-* [TRUE Dataset
-* [SummaC Benchmark
+* [TRUE Dataset](https://arxiv.org/pdf/2204.04991.pdf) (Minus Vitamin C, FEVER and PAWS) - 0.872 AUC Score
+* [SummaC Benchmark](https://aclanthology.org/2022.tacl-1.10.pdf) (Test Split) - 0.764 Balanced Accuracy, 0.831 AUC Score
 * [AnyScale Ranking Test for Hallucinations](https://www.anyscale.com/blog/llama-2-is-about-as-factually-accurate-as-gpt-4-for-summaries-and-is-30x-cheaper) - 86.6 % Accuracy
 
 ## Usage