speechless-coding-7b-16k-tora

The following datasets were used to fine-tune llm_agents/tora-code-7b-v1.0 in order to improve the model's reasoning and planning abilities.

  • context window length: 16,384
  • prompt_type = "alpaca"
  • max_tokens > 128 && < 16384
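Since the card sets prompt_type = "alpaca", a prompt can be assembled roughly as follows. This is a minimal sketch that assumes the widely used Alpaca template wording; the card does not print the exact template, and `build_alpaca_prompt` is a hypothetical helper name, not part of the speechless repository.

```python
# Alpaca-style templates. The wording is the commonly used Alpaca text and is
# an assumption here; the model card only states prompt_type = "alpaca".
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Return an Alpaca-style prompt (hypothetical helper for illustration)."""
    if input_text:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a Python function that reverses a string."))
```

The returned string ends with the "### Response:" header, so the model's completion is the answer itself.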

Total: 177,333 samples, 316 MB

  • jondurbin/airoboros-2.2: Filter categories related to coding, reasoning and planning. 21,923 samples.
  • Open-Orca/OpenOrca: Filter the 'cot' category in 1M GPT4 dataset. 62,973 samples.
  • garage-bAInd/Open-Platypus: 100%, 22,760 samples.
  • WizardLM/WizardLM_evol_instruct_V2_196k: Coding conversation part. 30,081 samples.
  • TokenBender/python_eval_instruct_51k: “python” in output. 39,596 samples.
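The selection rules above (a keyword in the output, plus the token-length bounds) could be sketched as a simple filter. This is an illustration only, not the authors' pipeline: the `instruction` and `output` field names, the helper names, and the whitespace-based token count are all assumptions, and treating the `max_tokens` bounds as a per-sample length filter is one possible reading of the card.

```python
from typing import Callable, Dict, Iterable, List, Optional

Sample = Dict[str, str]


def rough_token_count(text: str) -> int:
    # Whitespace split is only a stand-in for the model's real tokenizer.
    return len(text.split())


def keep_sample(
    sample: Sample,
    count_tokens: Callable[[str], int] = rough_token_count,
    min_tokens: int = 128,
    max_tokens: int = 16384,
    keyword: Optional[str] = None,
) -> bool:
    """Keep a sample if min_tokens < length < max_tokens and, when a keyword
    is given, the keyword appears in the output field (assumed field name)."""
    text = sample.get("instruction", "") + " " + sample.get("output", "")
    n_tokens = count_tokens(text)
    if not (min_tokens < n_tokens < max_tokens):
        return False
    if keyword is not None and keyword not in sample.get("output", ""):
        return False
    return True


def filter_samples(samples: Iterable[Sample], **kwargs) -> List[Sample]:
    return [s for s in samples if keep_sample(s, **kwargs)]
```

For example, `filter_samples(samples, keyword="python")` mirrors the “python” in output rule, while omitting `keyword` applies only the length bounds.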

50 samples/T=0.2/MaxTokens=512/Top_P=0.95

Code: https://github.com/uukuguy/speechless

HumanEval

| Metric | Value |
| --- | --- |
| humaneval-python | 52.44 |
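With 50 samples generated per problem, a score like the one above is typically computed with the unbiased pass@k estimator commonly used with HumanEval. The card does not state the exact metric, so the sketch below assumes pass@1 under that estimator; the function names are illustrative.

```python
from math import comb
from typing import Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the tests, k: budget.
    Computes 1 - C(n - c, k) / C(n, k).
    """
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts: Sequence[int], n: int, k: int) -> float:
    """Average the per-problem estimates over a benchmark."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


if __name__ == "__main__":
    # Hypothetical per-problem pass counts out of n = 50 samples each.
    print(mean_pass_at_k([0, 25, 50], n=50, k=1))
```

For k = 1 the estimator reduces to the fraction of passing samples, c / n, averaged over problems.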

Big Code Models Leaderboard

  • CodeLlama-34B-Python: 53.29
  • CodeLlama-34B-Instruct: 50.79
  • CodeLlama-13B-Instruct: 50.6
  • CodeLlama-34B: 45.11
  • CodeLlama-13B-Python: 42.89
  • CodeLlama-13B: 35.07

MultiPL-E

| Metric | Value |
| --- | --- |
| python | 55.96 |
| java | 37.84 |
| javascript | 46.93 |
| cpp | 37.48 |
| rust | 29.01 |
| go | 28.99 |
| sh | 12.11 |
| julia | 31.47 |
| typescript | 47.80 |

LMEval

[Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

| Metric | Value |
| --- | --- |
| ARC | |
| HellaSwag | |
| MMLU | |
| TruthfulQA | |
| Average | |

Parameters

| Parameter | Value |
| --- | --- |
| lr | 2e-4 |
| lr_scheduler_type | cosine |
| weight_decay | 0.0 |
| optim | paged_adamw_8bit |
| flash_attention | True |
| rerope | False |
| max_new_tokens | 16384 |
| num_train_epochs | 2 |
| bits | 4 |
| lora_r | 64 |
| lora_alpha | 256 |
| lora_dropout | 0.05 |
| double_quant | True |
| quant_type | nf4 |
| dataset_format | sharegpt |
| mini_batch_size | 2 |
| gradient_accumulation_steps | 32 |
| bf16 | True |

A100-40G x 4
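Two derived quantities follow from the table: the effective batch size and the LoRA scaling factor. The sketch below assumes that mini_batch_size is per GPU, that the four A100s are used in plain data parallelism, and that the standard LoRA scaling alpha / r applies; none of this is confirmed by the card, and `TrainSetup` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    # Values copied from the parameter table and hardware line.
    mini_batch_size: int = 2              # assumed to be per GPU
    gradient_accumulation_steps: int = 32
    num_gpus: int = 4                     # assumes data parallelism over all GPUs
    lora_r: int = 64
    lora_alpha: int = 256

    @property
    def effective_batch_size(self) -> int:
        """Samples contributing to one optimizer step."""
        return self.mini_batch_size * self.gradient_accumulation_steps * self.num_gpus

    @property
    def lora_scaling(self) -> float:
        """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
        return self.lora_alpha / self.lora_r


if __name__ == "__main__":
    setup = TrainSetup()
    print(setup.effective_batch_size)  # 2 * 32 * 4 = 256
    print(setup.lora_scaling)          # 256 / 64 = 4.0
```

If only one GPU held each batch (for example under model parallelism), the effective batch size would instead be 2 * 32 = 64.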