Model Card for deberta-v3-large-Rationale-to-Score
This repository hosts a version of microsoft/deberta-v3-large
that has been fine-tuned to assess text-based rationales and generate corresponding scores. Given a free-text rationale as input, the model outputs a numerical score.
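The scoring step can be sketched as follows. This is a minimal illustration, not the authors' reference code: the Hub id below is a placeholder, and it assumes the checkpoint exposes a single-label regression head loadable via AutoModelForSequenceClassification, so the raw logit is read as the score.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder: replace with the actual Hub id of this repository.
MODEL_ID = "deberta-v3-large-Rationale-to-Score"


def score_rationale(rationale: str, model, tokenizer) -> float:
    """Return the model's numerical score for a free-text rationale."""
    inputs = tokenizer(rationale, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: a regression head (num_labels=1), so the raw logit is the score.
    return logits.squeeze().item()


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    print(score_rationale("The answer explains the key mechanism step by step.", model, tokenizer))
```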
For details of the training process and methodology, please refer to our research paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring.
If you use this model in your research, please cite our work:
Citation Information
@misc{li2024calibratingllmspreferenceoptimization,
      title={Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring},
      author={Jiazheng Li and Hainiu Xu and Zhaoyue Sun and Yuxiang Zhou and David West and Cesare Aloisi and Yulan He},
      year={2024},
      eprint={2406.19949},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.19949},
}