---
library_name: transformers
license: llama2
base_model:
- codellama/CodeLlama-70b-Instruct-hf
pipeline_tag: text-generation
---

This is a 4-bit version of [CodeLlama-70b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf), converted with bitsandbytes. For more information about the model, refer to the original model's page.
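
Since the repository stores the weights already quantized, loading it with `transformers` should pick up the embedded bitsandbytes configuration automatically. Below is a minimal loading sketch, assuming a recent `transformers` release with `bitsandbytes` and `accelerate` installed; the prompt is only an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cmarkea/CodeLlama-70b-Instruct-hf-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The checkpoint ships with its bitsandbytes 4-bit quantization config,
# so no explicit BitsAndBytesConfig should be needed at load time.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# CodeLlama-70b-Instruct expects a chat format; apply_chat_template builds it.
chat = [{"role": "user", "content": "Écris une fonction Python qui teste si un nombre est premier."}]
inputs = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
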

# Impact on performance

We evaluated the models using a panel of large proprietary models as judges (GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet). Scores range from 0, indicating a model unsuitable for the task, to 5, representing a model that fully meets expectations. The evaluation covered 67 instructions across four programming languages: Python, Java, JavaScript, and pseudo-code. All tests were conducted in a French-language context, and models were heavily penalized for responding in another language, even when the response was technically correct.
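
As a rough illustration of this LLM-as-judge protocol, a sketch might look like the following; the judge prompt, the `judge.complete` client interface, and the aggregation are assumptions for illustration, not the exact harness behind the figures below:

```python
from statistics import mean

# Hypothetical judge prompt mirroring the protocol described above:
# grades from 0 to 5, with a heavy penalty for answers not written in French.
JUDGE_PROMPT = (
    "Rate the following answer to the instruction on a scale from 0 "
    "(unusable) to 5 (fully meets expectations). Heavily penalize any "
    "answer not written in French, even if technically correct.\n\n"
    "Instruction: {instruction}\n\nAnswer: {answer}\n\nGrade (0-5):"
)

def grade(judge, instruction: str, answer: str) -> float:
    """Ask one judge model for a grade; `judge.complete` is a hypothetical client."""
    reply = judge.complete(JUDGE_PROMPT.format(instruction=instruction, answer=answer))
    return float(reply.strip().split()[0])

def evaluate(answers: list[str], instructions: list[str], judges: list) -> float:
    """Average the grades of every judge over all instructions."""
    return mean(
        grade(judge, instr, ans)
        for instr, ans in zip(instructions, answers)
        for judge in judges
    )
```
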
| model                                        |  score   | # params (Billion) | size (GB) |
|---------------------------------------------:|:--------:|:------------------:|:---------:|
| gemini-1.5-pro                               | 4.51     | NA                 | NA        |
| gpt-4o                                       | 4.51     | NA                 | NA        |
| claude3.5-sonnet                             | 4.49     | NA                 | NA        |
| deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct  | 4.24     | 15.7               | 31.4      |
| meta-llama/Meta-Llama-3.1-70B-Instruct       | 4.23     | 70.06              | 140.12    |
| cmarkea/Meta-Llama-3.1-70B-Instruct-4bit     | 4.14     | 70.06              | 35.3      |
| cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit      | 3.80     | 46.7               | 23.35     |
| meta-llama/Meta-Llama-3.1-8B-Instruct        | 3.73     | 8.03               | 16.06     |
| mistralai/Mixtral-8x7B-Instruct-v0.1         | 3.33     | 46.7               | 93.4      |
| codellama/CodeLlama-13b-Instruct-hf          | 3.33     | 13                 | 26        |
| codellama/CodeLlama-34b-Instruct-hf          | 3.27     | 33.7               | 67.4      |
| codellama/CodeLlama-7b-Instruct-hf           | 3.19     | 6.74               | 13.48     |
| cmarkea/CodeLlama-34b-Instruct-hf-4bit       | 3.12     | 33.7               | 16.85     |
| codellama/CodeLlama-70b-Instruct-hf          | 1.82     | 69                 | 138       |
| **cmarkea/CodeLlama-70b-Instruct-hf-4bit**   | **1.64** | **69**             | **34.5**  |