
CodeBooga-34B-v0.1

This is a merge between the following two models:

  1. Phind-CodeLlama-34B-v2
  2. WizardCoder-Python-34B-V1.0

It was created with the BlockMerge Gradient script, using the same script and settings that produced MythoMax-L2-13b. The following YAML was used; a rough sketch of the blending it describes is shown after the listing:

model_path1: "Phind_Phind-CodeLlama-34B-v2_safetensors"
model_path2: "WizardLM_WizardCoder-Python-34B-V1.0_safetensors"
output_model_path: "CodeBooga-34B-v0.1"
operations:
  - operation: lm_head # Single tensor
    filter: "lm_head"
    gradient_values: [0.75]
  - operation: embed_tokens # Single tensor
    filter: "embed_tokens"
    gradient_values: [0.75]
  - operation: self_attn
    filter: "self_attn"
    gradient_values: [0.75, 0.25]
  - operation: mlp
    filter: "mlp"
    gradient_values: [0.25, 0.75]
  - operation: layernorm
    filter: "layernorm"
    gradient_values: [0.5, 0.5]
  - operation: modelnorm # Single tensor
    filter: "model.norm"
    gradient_values: [0.75]
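
In essence, each gradient value is a mixing ratio for a weighted average of the two parent models' tensors, ramped across the layer stack when two values are given. Below is a minimal sketch of that idea, not the actual BlockMerge Gradient code; the function names, the linear ramp, and the assumption that the values give model 1's share are all mine:

import numpy as np

def blend(t1, t2, ratio):
    # Weighted average of two tensors; `ratio` is the share taken from model 1.
    return ratio * t1 + (1.0 - ratio) * t2

def layer_ratio(layer_idx, num_layers, gradient_values):
    # Single-valued entries (lm_head, embed_tokens, model.norm) use a constant
    # ratio; two-valued entries ramp from the first to the last layer.
    if len(gradient_values) == 1:
        return gradient_values[0]
    start, end = gradient_values
    return float(np.interp(layer_idx, [0, num_layers - 1], [start, end]))

# Example: under this reading, self_attn tensors start at 0.75 (mostly model 1)
# in layer 0 and end at 0.25 in the last layer of a 48-layer 34B model.
for i in (0, 24, 47):
    print(i, round(layer_ratio(i, 48, [0.75, 0.25]), 3))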

Prompt format

Both base models use the Alpaca prompt format, so it should be used for this model as well:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Your instruction

### Response:
Bot reply

### Instruction:
Another instruction

### Response:
Bot reply
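
If you are building prompts programmatically, here is a minimal sketch of assembling that template in Python (the helper name is illustrative, not part of any library):

def build_alpaca_prompt(instruction, history=None):
    # `history` is a list of (instruction, response) pairs from earlier turns.
    prompt = ("Below is an instruction that describes a task. "
              "Write a response that appropriately completes the request.\n\n")
    for past_instruction, past_response in (history or []):
        prompt += f"### Instruction:\n{past_instruction}\n\n### Response:\n{past_response}\n\n"
    prompt += f"### Instruction:\n{instruction}\n\n### Response:\n"
    return prompt

print(build_alpaca_prompt("Write a Python function that reverses a string."))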

Evaluation

I ran a quick experiment where I asked a set of 3 Python and 3 JavaScript questions (difficult, nuanced real-world questions) to the following models:

  1. This one
  2. A second variant generated with model_path1 and model_path2 swapped in the YAML above, which I called CodeBooga-Reversed-34B-v0.1
  3. WizardCoder-Python-34B-V1.0
  4. Phind-CodeLlama-34B-v2

Specifically, I used 4.250b EXL2 quantizations of each. I then ranked the responses for each question by quality and assigned the following scores:

  • 4th place: 0
  • 3rd place: 1
  • 2nd place: 2
  • 1st place: 4
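
To make the scoring concrete, here is a small sketch of how per-question placements add up under this scheme (the placements below are invented for illustration, not the actual rankings):

# Points awarded per placement, as described above.
POINTS = {1: 4, 2: 2, 3: 1, 4: 0}

# Hypothetical placements for one model across the six questions.
placements = [1, 1, 2, 3, 1, 2]
print(sum(POINTS[p] for p in placements))  # 4 + 4 + 2 + 1 + 4 + 2 = 17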

The resulting cumulative scores were:

  • CodeBooga-34B-v0.1: 22
  • WizardCoder-Python-34B-V1.0: 12
  • Phind-CodeLlama-34B-v2: 7
  • CodeBooga-Reversed-34B-v0.1: 1

CodeBooga-34B-v0.1 performed very well, while the reversed variant performed poorly, so I uploaded the former but not the latter.

Recommended settings

I recommend the Divine Intellect preset for instruction-following models like this one, as per the Preset Arena experiment results:

temperature: 1.31
top_p: 0.14
repetition_penalty: 1.17
top_k: 49
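
These map directly onto the standard sampling arguments in Transformers. A minimal sketch, assuming the full-precision weights at oobabooga/CodeBooga-34B-v0.1 and enough GPU memory for a 34B model:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oobabooga/CodeBooga-34B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.\n\n"
          "### Instruction:\nWrite a Python function that reverses a string.\n\n"
          "### Response:\n")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    do_sample=True,          # sampling must be enabled for these settings to apply
    temperature=1.31,
    top_p=0.14,
    top_k=49,
    repetition_penalty=1.17,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))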

Quantized versions

EXL2

A 4.250b EXL2 version of the model can be found here:

https://huggingface.co/oobabooga/CodeBooga-34B-v0.1-EXL2-4.250b
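
One way to fetch it is with huggingface_hub (a sketch; the local cache path it prints will vary):

from huggingface_hub import snapshot_download

# Downloads the 4.250b EXL2 weights into the local Hugging Face cache and
# returns the directory they were placed in.
path = snapshot_download(repo_id="oobabooga/CodeBooga-34B-v0.1-EXL2-4.250b")
print(path)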

GGUF

TheBloke has kindly provided GGUF quantizations for llama.cpp:

https://huggingface.co/TheBloke/CodeBooga-34B-v0.1-GGUF
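
For a quick local test with llama-cpp-python, a minimal sketch (the GGUF filename is a placeholder; point it at whichever quantization you downloaded from that repository):

from llama_cpp import Llama

llm = Llama(model_path="codebooga-34b-v0.1.Q4_K_M.gguf", n_ctx=4096)

prompt = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.\n\n"
          "### Instruction:\nWrite a Python function that reverses a string.\n\n"
          "### Response:\n")
result = llm(prompt, max_tokens=256, temperature=1.31, top_p=0.14,
             top_k=49, repeat_penalty=1.17)
print(result["choices"][0]["text"])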
