File size: 6,804 Bytes

---
library_name: transformers
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
---
# CapyLake-7B-v2-laser

This model is a finetune of [cognitivecomputations/WestLake-7B-v2-Laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser) on [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized)

<div align="center">  

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/kx2uwS_kZ-rTAJiusSrAW.webp)

[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

</div>

## Process

+ Realigned the chat template to ChatML 
+ Completed 1 Epoch
+ 5e-05 learning rate
+ Training time was about 2 hours on 1 H100
+ Cost was ~$8

## Code Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "macadeliccc/CapyLake-7B-v2-laser"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "Create an idea for a TV show and write a short pilot script"
inputs = tokenizer(text, return_tensors="pt")

# Adding hyperparameters to the generation call
outputs = model.generate(
    **inputs,
    max_new_tokens=4096,  # Controls the maximum length of the new tokens created
    temperature=0.7,  # Adjust for creativity (lower is less random)
    top_k=50,  # Keeps the top k tokens for sampling
    top_p=0.95,  # Uses nucleus sampling with this cumulative probability
    num_return_sequences=1,  # Number of sequences to generate
    no_repeat_ngram_size=2,  # Prevents repeating n-grams to ensure diversity
    early_stopping=True  # Stops generation when all sequences reach the EOS token
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Other Capy Models
 
SOLAR-10.7B-Capy-v1.0 is also on the way. There could be more depending on performance!

## Evaluations

|                                     Model                                     |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[CapyLake-7B-v2-laser](https://huggingface.co/macadeliccc/CapyLake-7B-v2-laser)|  44.34|  77.77|     68.47|   47.92|  59.62|

### AGIEval
|             Task             |Version| Metric |Value|   |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat              |      0|acc     |28.35|±  |  2.83|
|                              |       |acc_norm|25.98|±  |  2.76|
|agieval_logiqa_en             |      0|acc     |38.86|±  |  1.91|
|                              |       |acc_norm|39.02|±  |  1.91|
|agieval_lsat_ar               |      0|acc     |25.22|±  |  2.87|
|                              |       |acc_norm|24.35|±  |  2.84|
|agieval_lsat_lr               |      0|acc     |50.39|±  |  2.22|
|                              |       |acc_norm|51.57|±  |  2.22|
|agieval_lsat_rc               |      0|acc     |65.06|±  |  2.91|
|                              |       |acc_norm|63.94|±  |  2.93|
|agieval_sat_en                |      0|acc     |78.64|±  |  2.86|
|                              |       |acc_norm|78.64|±  |  2.86|
|agieval_sat_en_without_passage|      0|acc     |40.78|±  |  3.43|
|                              |       |acc_norm|40.78|±  |  3.43|
|agieval_sat_math              |      0|acc     |33.64|±  |  3.19|
|                              |       |acc_norm|30.45|±  |  3.11|

Average: 44.34%

### GPT4All
|    Task     |Version| Metric |Value|   |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge|      0|acc     |66.89|±  |  1.38|
|             |       |acc_norm|67.49|±  |  1.37|
|arc_easy     |      0|acc     |86.70|±  |  0.70|
|             |       |acc_norm|81.90|±  |  0.79|
|boolq        |      1|acc     |88.10|±  |  0.57|
|hellaswag    |      0|acc     |71.45|±  |  0.45|
|             |       |acc_norm|87.78|±  |  0.33|
|openbookqa   |      0|acc     |39.80|±  |  2.19|
|             |       |acc_norm|49.80|±  |  2.24|
|piqa         |      0|acc     |82.86|±  |  0.88|
|             |       |acc_norm|84.87|±  |  0.84|
|winogrande   |      0|acc     |84.45|±  |  1.02|

Average: 77.77%

### TruthfulQA
|    Task     |Version|Metric|Value|   |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc|      1|mc1   |53.98|±  |  1.74|
|             |       |mc2   |68.47|±  |  1.53|

Average: 68.47%

### Bigbench

|                      Task                      |Version|       Metric        |Value|   |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement                       |      0|multiple_choice_grade|59.47|±  |  3.57|
|bigbench_date_understanding                     |      0|multiple_choice_grade|64.50|±  |  2.49|
|bigbench_disambiguation_qa                      |      0|multiple_choice_grade|44.96|±  |  3.10|
|bigbench_geometric_shapes                       |      0|multiple_choice_grade|22.84|±  |  2.22|
|                                                |       |exact_str_match      | 2.79|±  |  0.87|
|bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|30.80|±  |  2.07|
|bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|21.57|±  |  1.56|
|bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|56.67|±  |  2.87|
|bigbench_movie_recommendation                   |      0|multiple_choice_grade|51.60|±  |  2.24|
|bigbench_navigate                               |      0|multiple_choice_grade|51.00|±  |  1.58|
|bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|70.35|±  |  1.02|
|bigbench_ruin_names                             |      0|multiple_choice_grade|51.79|±  |  2.36|
|bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|35.97|±  |  1.52|
|bigbench_snarks                                 |      0|multiple_choice_grade|79.01|±  |  3.04|
|bigbench_sports_understanding                   |      0|multiple_choice_grade|75.66|±  |  1.37|
|bigbench_temporal_sequences                     |      0|multiple_choice_grade|47.90|±  |  1.58|
|bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|23.84|±  |  1.21|
|bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|18.00|±  |  0.92|
|bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|56.67|±  |  2.87|

Average: 47.92%

Average score: 59.62%

Elapsed time: 01:57:56