Anthropic/discrim-eval
Benchmarks and other datasets that can be used to evaluate bias in healthcare settings.
Note Evaluation dataset covering a variety of situations; 270 of its scenarios contain the word "patient".
Note Multilingual (English, Spanish, Hindi, Chinese) dataset of responses to several long-form patient question-answering (QA) datasets.
Note Dataset for evaluating how well LLMs extract social determinants of health (SDOH).