macadeliccc's picture
Update README.md
0fc374c verified
---
library_name: transformers
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
---
# CapyLake-7B-v2-laser
This model is a finetune of [cognitivecomputations/WestLake-7B-v2-Laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser) on [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized)
<div align="center">
![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/kx2uwS_kZ-rTAJiusSrAW.webp)
[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)
</div>
## Process
+ Realigned the chat template to ChatML
+ Completed 1 Epoch
+ 5e-05 learning rate
+ Training time was about 2 hours on 1 H100
+ Cost was ~$8
## Code Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "macadeliccc/CapyLake-7B-v2-laser"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
text = "Create an idea for a TV show and write a short pilot script"
inputs = tokenizer(text, return_tensors="pt")
# Adding hyperparameters to the generation call
outputs = model.generate(
**inputs,
max_new_tokens=4096, # Controls the maximum length of the new tokens created
temperature=0.7, # Adjust for creativity (lower is less random)
top_k=50, # Keeps the top k tokens for sampling
top_p=0.95, # Uses nucleus sampling with this cumulative probability
num_return_sequences=1, # Number of sequences to generate
no_repeat_ngram_size=2, # Prevents repeating n-grams to ensure diversity
early_stopping=True # Stops generation when all sequences reach the EOS token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Other Capy Models
SOLAR-10.7B-Capy-v1.0 is also on the way. There could be more depending on performance!
## Evaluations
| Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[CapyLake-7B-v2-laser](https://huggingface.co/macadeliccc/CapyLake-7B-v2-laser)| 44.34| 77.77| 68.47| 47.92| 59.62|
### AGIEval
| Task |Version| Metric |Value| |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat | 0|acc |28.35|± | 2.83|
| | |acc_norm|25.98|± | 2.76|
|agieval_logiqa_en | 0|acc |38.86|± | 1.91|
| | |acc_norm|39.02|± | 1.91|
|agieval_lsat_ar | 0|acc |25.22|± | 2.87|
| | |acc_norm|24.35|± | 2.84|
|agieval_lsat_lr | 0|acc |50.39|± | 2.22|
| | |acc_norm|51.57|± | 2.22|
|agieval_lsat_rc | 0|acc |65.06|± | 2.91|
| | |acc_norm|63.94|± | 2.93|
|agieval_sat_en | 0|acc |78.64|± | 2.86|
| | |acc_norm|78.64|± | 2.86|
|agieval_sat_en_without_passage| 0|acc |40.78|± | 3.43|
| | |acc_norm|40.78|± | 3.43|
|agieval_sat_math | 0|acc |33.64|± | 3.19|
| | |acc_norm|30.45|± | 3.11|
Average: 44.34%
### GPT4All
| Task |Version| Metric |Value| |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge| 0|acc |66.89|± | 1.38|
| | |acc_norm|67.49|± | 1.37|
|arc_easy | 0|acc |86.70|± | 0.70|
| | |acc_norm|81.90|± | 0.79|
|boolq | 1|acc |88.10|± | 0.57|
|hellaswag | 0|acc |71.45|± | 0.45|
| | |acc_norm|87.78|± | 0.33|
|openbookqa | 0|acc |39.80|± | 2.19|
| | |acc_norm|49.80|± | 2.24|
|piqa | 0|acc |82.86|± | 0.88|
| | |acc_norm|84.87|± | 0.84|
|winogrande | 0|acc |84.45|± | 1.02|
Average: 77.77%
### TruthfulQA
| Task |Version|Metric|Value| |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc| 1|mc1 |53.98|± | 1.74|
| | |mc2 |68.47|± | 1.53|
Average: 68.47%
### Bigbench
| Task |Version| Metric |Value| |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement | 0|multiple_choice_grade|59.47|± | 3.57|
|bigbench_date_understanding | 0|multiple_choice_grade|64.50|± | 2.49|
|bigbench_disambiguation_qa | 0|multiple_choice_grade|44.96|± | 3.10|
|bigbench_geometric_shapes | 0|multiple_choice_grade|22.84|± | 2.22|
| | |exact_str_match | 2.79|± | 0.87|
|bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|30.80|± | 2.07|
|bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|21.57|± | 1.56|
|bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|56.67|± | 2.87|
|bigbench_movie_recommendation | 0|multiple_choice_grade|51.60|± | 2.24|
|bigbench_navigate | 0|multiple_choice_grade|51.00|± | 1.58|
|bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|70.35|± | 1.02|
|bigbench_ruin_names | 0|multiple_choice_grade|51.79|± | 2.36|
|bigbench_salient_translation_error_detection | 0|multiple_choice_grade|35.97|± | 1.52|
|bigbench_snarks | 0|multiple_choice_grade|79.01|± | 3.04|
|bigbench_sports_understanding | 0|multiple_choice_grade|75.66|± | 1.37|
|bigbench_temporal_sequences | 0|multiple_choice_grade|47.90|± | 1.58|
|bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|23.84|± | 1.21|
|bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|18.00|± | 0.92|
|bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|56.67|± | 2.87|
Average: 47.92%
Average score: 59.62%
Elapsed time: 01:57:56