---
language:
- en
license: apache-2.0
library_name: transformers
base_model:
- HuggingFaceH4/mistral-7b-anthropic
- ajibawa-2023/Code-Mistral-7B
- Undi95/BigL-7B
model-index:
- name: autocodit
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.38
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.82
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.09
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 59.95
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 80.51
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 60.65
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=adowu/autocodit
      name: Open LLM Leaderboard
---

# AUTOCODIT

# Description

AUTOCODIT is a merge of three 7B language models: BigL-7B, Code-Mistral-7B, and mistral-7b-anthropic. The models were combined with the TIES merge method, with the aim of improving performance and adaptability across a broad range of natural language processing tasks.

# Creation Process

The model was created with the TIES merge method, which trims small parameter changes and resolves sign conflicts between models before combining them. It was chosen to preserve the distinct capabilities of each constituent model in a single set of weights. The base model for the merge was HuggingFaceH4/mistral-7b-anthropic, selected for its architecture and performance.

The merge used the following density and weight parameters for each model:
- BigL-7B: density 0.9, weight 0.8. It contributes general language understanding and generation.
- Code-Mistral-7B: density 0.7, weight 0.7. It contributes proficiency in code-related tasks and technical language.
- mistral-7b-anthropic (the base model): density 0.9, weight 0.8. It keeps general language processing as the foundation of the merged model.
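The effect of the density and weight values listed above can be illustrated with a toy version of TIES merging. The sketch below is a minimal pure-Python illustration on short lists of numbers, not the tooling used to build this model. The function names `trim` and `ties_merge` are hypothetical, the example vectors are made up, and the code assumes that density is the fraction of each model's changes relative to the base that is kept and that weight scales each model's contribution. Under this formulation the base model contributes no change of its own, so the density and weight listed for it are not used.

```python
def trim(task_vector, density):
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    k = round(len(task_vector) * density)
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, models, densities, weights):
    """Toy TIES merge: trim, elect a sign per parameter, merge agreeing values."""
    # Step 1: task vectors (difference from the base), trimmed by density.
    trimmed = []
    for params, density in zip(models, densities):
        delta = [p - b for p, b in zip(params, base)]
        trimmed.append(trim(delta, density))

    merged = []
    for j in range(len(base)):
        weighted = [w * t[j] for t, w in zip(trimmed, weights)]
        # Step 2: elect the sign with the larger total weighted magnitude.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # Step 3: weighted average over the models that agree with that sign.
        agreeing = [(wv, w) for wv, w in zip(weighted, weights) if wv * sign > 0]
        if agreeing:
            change = sum(wv for wv, _ in agreeing) / sum(w for _, w in agreeing)
        else:
            change = 0.0
        merged.append(base[j] + change)
    return merged


# Made-up four-parameter "models"; the densities and weights mirror the list above.
base = [0.0, 1.0, -1.0, 0.5]
model_a = [0.2, 1.5, -1.0, 0.4]   # stand-in for BigL-7B
model_b = [-0.1, 1.2, -0.6, 0.5]  # stand-in for Code-Mistral-7B
merged = ties_merge(base, [model_a, model_b],
                    densities=[0.9, 0.7], weights=[0.8, 0.7])
print(merged)
```

A lower density discards more of a model's small changes before merging, and a higher weight gives a model more influence both in the sign election and in the final average.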

# Features
- Model type: `MistralForCausalLM`
- Vocabulary size: 32,000 tokens
- Maximum position embeddings: 32,768, which allows long passages of text to be processed
- Hidden size: 4,096
- Number of attention heads: 32
- Number of hidden layers: 32
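A few derived quantities follow directly from these figures. The snippet below is a small sketch that copies the values from the list into a plain dictionary and computes the per-head dimension and the size of the token embedding table. The dictionary keys and helper names are illustrative, and the calculation assumes the hidden size is split evenly across attention heads.

```python
# Values copied from the feature list; the key names are illustrative.
FEATURES = {
    "architecture": "MistralForCausalLM",
    "vocab_size": 32_000,
    "max_position_embeddings": 32_768,
    "hidden_size": 4_096,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
}


def head_dim(features):
    """Per-head dimension, assuming an even split of the hidden size."""
    hidden = features["hidden_size"]
    heads = features["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden size is not divisible by the number of heads")
    return hidden // heads


def embedding_parameters(features):
    """Number of weights in a vocab_size x hidden_size embedding table."""
    return features["vocab_size"] * features["hidden_size"]


print(head_dim(FEATURES))
print(embedding_parameters(FEATURES))
```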

# Applications
The model is intended for natural language processing tasks such as text generation, language translation, and code synthesis. Because it combines BigL-7B, Code-Mistral-7B, and mistral-7b-anthropic, it is aimed at scenarios that require an understanding of both human and programming languages.
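For text generation, the model can presumably be loaded through the `transformers` library named in the metadata. The following is a minimal sketch, assuming the repository id is `adowu/autocodit` (inferred from the leaderboard URLs), that `transformers` and PyTorch are installed, and that enough memory is available for a 7B model. The helper name `generate_text` and the default generation length are illustrative, not recommendations from the model author.

```python
MODEL_ID = "adowu/autocodit"  # assumption: inferred from the leaderboard URLs


def generate_text(prompt, max_new_tokens=128):
    """Load the model and return the prompt followed by the generated text."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Write a Python function that reverses a string.")` downloads the weights on first use. The helper reloads the model on every call for simplicity; a real application would load the tokenizer and model once and reuse them.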

---
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_adowu__autocodit)

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 69.57 |
| AI2 Reasoning Challenge (25-Shot) | 66.38 |
| HellaSwag (10-Shot)               | 84.82 |
| MMLU (5-Shot)                     | 65.09 |
| TruthfulQA (0-shot)               | 59.95 |
| Winogrande (5-shot)               | 80.51 |
| GSM8k (5-shot)                    | 60.65 |
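
The "Avg." row appears to be the unweighted mean of the six benchmark scores. As a quick check, the short sketch below recomputes it from the values in the table; the variable names are illustrative.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 66.38,
    "HellaSwag (10-Shot)": 84.82,
    "MMLU (5-Shot)": 65.09,
    "TruthfulQA (0-shot)": 59.95,
    "Winogrande (5-shot)": 80.51,
    "GSM8k (5-shot)": 60.65,
}

# Unweighted mean over the six benchmarks, rounded to two decimals for display.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```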