prometheus-eval
/

prometheus-13b-v1.0

Text2Text Generation

text-generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

seungone commited on Oct 14, 2023

Commit

3e14f27

•

1 Parent(s): fb83e8b

Update README.md

Files changed (1) hide show

README.md +3 -0

README.md CHANGED Viewed

@@ -15,6 +15,9 @@ metrics:
 ---
 # TL;DR
 Prometheus is a language model using [Llama-2-Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) as a base model and fine-tuned on 100K feedback within the [Feedback Collection](https://huggingface.co/datasets/kaist-ai/Feedback-Collection).
 Since it was fine-tuned on a large amount of feedback, it is specialized at evaluating long-form responses, outperforming GPT-3.5-Turbo, Llama-2-Chat 70B, and on par with GPT-4 on various benchmarks.

 ---
 # TL;DR
+Prometheus is an alternative of GPT-4 evaluation when doing fine-grained evaluation of a LLM & a Reward model for Reinforcement Learning from Human Feedback (RLHF).
+![plot](./finegrained_eval.JPG)
 Prometheus is a language model using [Llama-2-Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) as a base model and fine-tuned on 100K feedback within the [Feedback Collection](https://huggingface.co/datasets/kaist-ai/Feedback-Collection).
 Since it was fine-tuned on a large amount of feedback, it is specialized at evaluating long-form responses, outperforming GPT-3.5-Turbo, Llama-2-Chat 70B, and on par with GPT-4 on various benchmarks.