Aarushhh
/

SmolLM-360M-Helpsteer2-Helpfulness

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Aarushhh commited on Aug 26

Commit

0bd25e3

•

1 Parent(s): 766c5a0

Update README.md

Files changed (1) hide show

README.md +42 -4

README.md CHANGED Viewed

@@ -9,13 +9,51 @@ tags:
 - unsloth
 - llama
 - trl
 ---
-# Uploaded  model
-- **Developed by:** Aarushhh
-- **Finetuned from model :** HuggingFaceTB/SmolLM-360M
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 - unsloth
 - llama
 - trl
+datasets:
+- Aarushhh/Helpsteer2-helpfulness-SFT
 ---
+# Smollm-360M Helpsteer2-helpfulness
+## Description
+This is a finetuned version of Smollm-360M with the helpfulness column of Helpsteer2
+## Use cases
+This model can be used to evaluate LLM responses
+## Usage
+The system prompt it was trained with is:
+```
+You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale:
+1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative.
+2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity.
+3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information.
+4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness.
+5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative.
+Provide a single numerical rating (1-5) based on the criteria above.
+```
+It is trained to only output a number 1-5
+## Dataset used
+This was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT)
+which I created
+## Base Model used
+The base model used is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M)
+### I was able to make this using only the Kaggle free tier
+## License
+[CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en)
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)