Zongxia committed
Commit 766b554
1 Parent(s): 0f4fd69

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -14,7 +14,7 @@ pipeline_tag: text-classification
  [![PyPI version qa-metrics](https://img.shields.io/pypi/v/qa-metrics.svg)](https://pypi.org/project/qa-metrics/)


- QA-Evaluation-Metrics is a fast and lightweight Python package for evaluating question-answering models. It provides various basic metrics to assess the performance of QA models. Check out our paper [**CFMatcher**](https://arxiv.org/abs/2401.13170), a matching method going beyond token-level matching and is more efficient than LLM matchings but still retains competitive evaluation performance of transformer LLM models.
+ QA-Evaluation-Metrics is a fast and lightweight Python package for evaluating question-answering models. It provides various basic metrics to assess the performance of QA models. Check out our paper [**PANDA**](https://arxiv.org/abs/2402.11161), a matching method going beyond token-level matching and is more efficient than LLM matchings but still retains competitive evaluation performance of transformer LLM models.


  ## Installation
@@ -63,7 +63,7 @@ match_result = f1_match(reference_answer, candidate_answer, threshold=0.5)
  print("F1 Match: ", match_result)
  ```

- #### CFMatch
+ #### PANDA
  ```python
  from qa_metrics.cfm import CFMatcher

@@ -76,13 +76,13 @@ print("Score: %s; bert Match: %s" % (scores, match_result))

  If you find this repo avialable, please cite our paper:
  ```bibtex
- @misc{li2024cfmatch,
-       title={CFMatch: Aligning Automated Answer Equivalence Evaluation with Expert Judgments For Open-Domain Question Answering},
-       author={Zongxia Li and Ishani Mondal and Yijun Liang and Huy Nghiem and Jordan Boyd-Graber},
-       year={2024},
-       eprint={2401.13170},
-       archivePrefix={arXiv},
-       primaryClass={cs.CL}
+ @misc{li2024panda,
+       title={PANDA (Pedantic ANswer-correctness Determination and Adjudication):Improving Automatic Evaluation for Question Answering and Text Generation},
+       author={Zongxia Li and Ishani Mondal and Yijun Liang and Huy Nghiem and Jordan Lee Boyd-Graber},
+       year={2024},
+       eprint={2402.11161},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL}
  }
  ```
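For orientation, the hunks above touch the README's two usage examples: token-level `f1_match` and the classifier-based matcher that this commit renames from CFMatch to PANDA. Below is a minimal usage sketch pieced together from the hunk headers and import lines; the `qa_metrics.f1` import path, the example inputs, and the `get_scores`/`cf_match` method names are assumptions (the diff only shows the import and `print` lines), so treat this as illustrative rather than the package's confirmed API.

```python
# Usage sketch reconstructed from the README snippets visible in this diff.
# The f1 import path and the CFMatcher method names are assumptions, not
# confirmed by the hunks above.
from qa_metrics.f1 import f1_match   # assumed import path
from qa_metrics.cfm import CFMatcher

question = "What is the capital of France?"  # hypothetical example inputs
reference_answer = "Paris"
candidate_answer = "The capital city of France is Paris."

# Token-level F1 matching, as shown in the @@ -63,7 hunk header.
match_result = f1_match(reference_answer, candidate_answer, threshold=0.5)
print("F1 Match: ", match_result)

# Classifier-based matching: CFMatch, renamed PANDA by this commit.
cfm = CFMatcher()
scores = cfm.get_scores(reference_answer, candidate_answer, question)      # assumed method
match_result = cfm.cf_match(reference_answer, candidate_answer, question)  # assumed method
print("Score: %s; bert Match: %s" % (scores, match_result))
```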