---
library_name: transformers
tags: []
---

# CapyLake-7B-v2-laser

This model is a finetune of [cognitivecomputations/WestLake-7B-v2-Laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser) on the [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized) dataset.

<div align="center">

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/kx2uwS_kZ-rTAJiusSrAW.webp)

[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

</div>

## Process

+ Realigned the chat template to ChatML
+ Trained for 1 epoch
+ Learning rate: 5e-05
+ Training time: ~2 hours on 1x H100
+ Cost: ~$8
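
Since the chat template was realigned to ChatML, prompts wrap each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of that format (illustrative only; in practice the authoritative template ships with the tokenizer and is applied via `tokenizer.apply_chat_template`):

```python
# Minimal sketch of the ChatML prompt format (illustrative; real code
# should use the tokenizer's built-in chat template instead).
def to_chatml(messages):
    """Render [{'role': ..., 'content': ...}] dicts as a ChatML string."""
    turns = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    turns.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(turns)

prompt = to_chatml([{"role": "user", "content": "Hello!"}])
print(prompt)
```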

## Code Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "macadeliccc/CapyLake-7B-v2-laser"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the prompt with the model's ChatML chat template
messages = [
    {"role": "user", "content": "Create an idea for a TV show and write a short pilot script"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(
    inputs,
    max_new_tokens=4096,     # Maximum number of new tokens to generate
    do_sample=True,          # Enable sampling (required for temperature/top_k/top_p to take effect)
    temperature=0.7,         # Lower values are less random
    top_k=50,                # Sample only from the 50 most likely tokens
    top_p=0.95,              # Nucleus sampling with this cumulative probability
    num_return_sequences=1,  # Number of sequences to generate
    no_repeat_ngram_size=2,  # Prevent repeated 2-grams to reduce repetition
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Other Capy Models

SOLAR-10.7B-Capy-v1.0 is also on the way. More could follow depending on performance!

## Evaluations

TODO