Update README.md

README.md

````diff
@@ -97,7 +97,7 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-For more information, see the
+For more information, see the model card of the [base model](https://huggingface.co/LemiSt/SmolLM-135M-de). This adapter was trained with QLoRA at rank 32 and alpha 16, on a dataset of around 200k German chat samples for two epochs.
 
 ## Intended uses & limitations
 
@@ -116,7 +116,7 @@ messages = [
     {"role": "user", "content": "Wie viele Hände hat ein normaler Mensch?"}
 ]
 inputs = tokenizer.apply_chat_template(messages, tokenize=True, return_tensors="pt", add_generation_prompt=True).to(device)
-outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.
+outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.4, top_p=0.9, repetition_penalty=1.1, top_k=512)
 print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
 ```
 ## Training and evaluation data
````
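The "QLoRA at rank 32 with alpha 16" wording corresponds to a PEFT `LoraConfig`; a minimal sketch of what that configuration might look like (the target modules and task type are assumptions, not stated in the commit):

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter settings described above:
# rank 32, alpha 16. Which modules the adapter targets is an assumption.
lora_config = LoraConfig(
    r=32,            # LoRA rank, as stated in the README change
    lora_alpha=16,   # LoRA alpha, as stated in the README change
    task_type="CAUSAL_LM",
)
```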
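The decode line in the updated snippet, `outputs[0][inputs.shape[1]:]`, keeps only the newly generated tokens by slicing off the prompt. A minimal sketch of that indexing with plain Python lists (no model needed; the token values are illustrative):

```python
# `inputs` in the snippet holds the tokenized prompt; `model.generate`
# returns the prompt followed by the continuation. Slicing from the
# prompt length onward leaves just the continuation.
prompt_tokens = [101, 2054, 2003]            # stand-in for the encoded prompt
generated = prompt_tokens + [7592, 102]      # stand-in for generate() output
new_tokens = generated[len(prompt_tokens):]  # analogous to outputs[0][inputs.shape[1]:]
print(new_tokens)  # [7592, 102]
```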