Aarushhh commited on
Commit
0bd25e3
1 Parent(s): 766c5a0

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +42 -4
README.md CHANGED
@@ -9,13 +9,51 @@ tags:
9
  - unsloth
10
  - llama
11
  - trl
 
 
12
  ---
13
 
14
- # Uploaded model
15
 
16
- - **Developed by:** Aarushhh
17
- - **Finetuned from model :** HuggingFaceTB/SmolLM-360M
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
 
19
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
20
 
21
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
9
  - unsloth
10
  - llama
11
  - trl
12
+ datasets:
13
+ - Aarushhh/Helpsteer2-helpfulness-SFT
14
  ---
15
 
 
16
 
17
+ # Smollm-360M Helpsteer2-helpfulness
18
+
19
+
20
+ ## Description
21
+ This is a finetuned version of Smollm-360M with the helpfulness column of Helpsteer2
22
+
23
+
24
+ ## Use cases
25
+
26
+ This model can be used to evaluate LLM responses
27
+ ## Usage
28
+
29
+ The system prompt it was trained with is:
30
+ ```
31
+ You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale:
32
+
33
+ 1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative.
34
+ 2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity.
35
+ 3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information.
36
+ 4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness.
37
+ 5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative.
38
+ Provide a single numerical rating (1-5) based on the criteria above.
39
+ ```
40
+
41
+ It is trained to only output a number 1-5
42
+ ## Dataset used
43
+
44
+ This was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT)
45
+
46
+ which I created
47
+
48
+
49
+ ## Base Model used
50
+
51
+ The base model used is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M)
52
+ ### I was able to make this using only the Kaggle free tier
53
+ ## License
54
+
55
+ [CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en)
56
+
57
 
 
58
 
59
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)