
Code-290k-6.7B-Instruct

This model is fine-tuned from DeepSeek-Coder-6.7B-Instruct using my existing dataset, Code-290k-ShareGPT, which contains around 290,000 sets of code. The training data covers Python, Java, JavaScript, Go, C++, Rust, Ruby, SQL, MySQL, R, Julia, Haskell, and other languages, and each code sample comes with a detailed explanation. The model uses the Alpaca prompt format. Besides generating code, it also explains the code it produces.

Training:

The entire dataset was trained on 4 x A100 80GB GPUs. Training for 3 epochs took 85 hours. The DeepSeek-Coder codebase and DeepSpeed were used for training.

This is a fully fine-tuned model.

Links to quantized models are given below.

Exllama

Exllama v2: Link

I am extremely thankful to Bartowski for making the quantized version of the model.

Example Prompt:

This is a conversation with your helpful AI assistant. AI assistant can generate Code in various Programming Languages along with necessary explanation.

### Instruction:
{instruction}

### Response:

You can modify the above prompt as per your requirements. I have used the Alpaca format.
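As a minimal sketch of how the template above could be filled in programmatically, the helper below substitutes an instruction into the prompt and strips the prompt back out of generated text. The function names `build_prompt` and `extract_response` are hypothetical, and the exact blank-line spacing is an assumption taken from the layout shown above.

```python
# Preamble copied from the example prompt above.
SYSTEM_PROMPT = (
    "This is a conversation with your helpful AI assistant. "
    "AI assistant can generate Code in various Programming Languages "
    "along with necessary explanation."
)

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str, system: str = SYSTEM_PROMPT) -> str:
    """Fill the Alpaca-style template with a user instruction.

    Hypothetical helper: the spacing between sections is assumed.
    """
    return (
        f"{system}\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"{RESPONSE_MARKER}\n"
    )


def extract_response(generated: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the whole string is returned stripped.
    """
    _, sep, tail = generated.partition(RESPONSE_MARKER)
    return tail.strip() if sep else generated.strip()


if __name__ == "__main__":
    prompt = build_prompt("Write a function that reverses a string in Go.")
    print(prompt)
```

The resulting string can be passed to whichever inference stack you use; the preamble can be swapped through the `system` argument if you modify the prompt.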

I want to say a special thanks to the open-source community for helping and guiding me to better understand AI/model development.

Thank you for your love & support.

Examples

  1. Bayes' theorem - Python (image)

  2. Fermat's little theorem (image)

  3. The Arrhenius equation using R (image)

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|-------|
| Avg.                              | 36.64 |
| AI2 Reasoning Challenge (25-shot) | 34.90 |
| HellaSwag (10-shot)               | 51.99 |
| MMLU (5-shot)                     | 34.89 |
| TruthfulQA (0-shot)               | 41.95 |
| Winogrande (5-shot)               | 52.64 |
| GSM8k (5-shot)                    | 3.49  |
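The "Avg." figure appears to be the plain arithmetic mean of the six benchmark scores. The short check below, with the values copied from the results above, is a sketch for verifying that; the function name `leaderboard_average` is hypothetical.

```python
# Scores copied from the evaluation results above.
SCORES = {
    "AI2 Reasoning Challenge (25-shot)": 34.90,
    "HellaSwag (10-shot)": 51.99,
    "MMLU (5-shot)": 34.89,
    "TruthfulQA (0-shot)": 41.95,
    "Winogrande (5-shot)": 52.64,
    "GSM8k (5-shot)": 3.49,
}


def leaderboard_average(scores: dict) -> float:
    """Unweighted mean of the benchmark scores, rounded to 2 decimals."""
    return round(sum(scores.values()) / len(scores), 2)


if __name__ == "__main__":
    print(leaderboard_average(SCORES))  # 36.64
```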
Model size: 6.74B params (Safetensors, tensor type BF16)