	---
	license: llama3
	---
	# 🔹 Key Highlights:

	- 29% Fewer Parameters: nyun-c2-llama3-50B has approximately 29% fewer parameters than the popular Llama-3-70B.
	- Comparable Performance: Despite having far fewer parameters, this model shows minimal performance degradation.
	- No Fine-Tuning Required: This model undergoes no fine-tuning, showcasing the raw potential of our optimization techniques.
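
The 29% figure can be sanity-checked with a quick calculation. This is only a sketch: it assumes the nominal sizes implied by the model names (50B and 70B), since the exact parameter counts are not stated in this card, and `relative_reduction` is a hypothetical helper name.

```python
# Nominal sizes in billions of parameters, read off the model names.
# Assumption: the exact parameter counts are not given in this card.
BASE_PARAMS_B = 70.0    # Llama-3-70B
PRUNED_PARAMS_B = 50.0  # nyun-c2-llama3-50B


def relative_reduction(base: float, pruned: float) -> float:
    """Return the fraction of parameters removed relative to the base model."""
    return (base - pruned) / base


reduction = relative_reduction(BASE_PARAMS_B, PRUNED_PARAMS_B)
print(f"{reduction:.1%} fewer parameters")  # prints "28.6% fewer parameters"
```

With these nominal sizes the reduction is about 28.6%, which rounds to the 29% quoted above.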

	## Pipeline and Collaboration

	For insights into the pipeline and the list of methods used to optimize these models, check out our PruneGPT repository (https://github.com/nyunAI/PruneGPT).
	We invite companies and organizations interested in joining forces with us to release more such open-source variants to reach out at [email protected].

	## Model Performance

	| Dataset | nyun-c2-llama3-50B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
	| --- | --- | --- | --- | --- |
	| MMLU (5-shot) | 78.4 | 79.5 | 69.7 | 67.9 |
	| Winogrande (5-shot) | 85.3 | 83.1 | 81.8 | 77.0 |
	| BoolQ (0-shot) | 83.9 | 79.0 | 73.1 | 83.0 |
	| Hellaswag (10-shot) | 85.4 | 88.0 | 86.9 | 85.5 |
	| Arc Challenge (25-shot) | 65.4 | 68.8 | 67.2 | 64.8 |
	| GSM8K (5-shot) | 64.7 | 76.9 | 52.6 | 50.2 |
	| Average | 77.2 | 79.2 | 71.9 | 71.4 |
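
The Average row is the unweighted mean of the six benchmark scores in each column. As a minimal sketch, the snippet below recomputes it from the numbers in the table and also reports how much of the Meta-Llama3-70B average the pruned model retains. The `retained` ratio is an illustrative summary and not a metric reported in this card.

```python
from statistics import mean

# Scores copied from the table above, in row order:
# MMLU, Winogrande, BoolQ, Hellaswag, Arc Challenge, GSM8K.
scores = {
    "nyun-c2-llama3-50B": [78.4, 85.3, 83.9, 85.4, 65.4, 64.7],
    "Meta-Llama3-70B": [79.5, 83.1, 79.0, 88.0, 68.8, 76.9],
    "Meta-Llama2-70B": [69.7, 81.8, 73.1, 86.9, 67.2, 52.6],
    "MBZUAI K2-65B": [67.9, 77.0, 83.0, 85.5, 64.8, 50.2],
}

# Unweighted mean per model, rounded to one decimal like the table.
averages = {model: round(mean(vals), 1) for model, vals in scores.items()}

# Share of the Meta-Llama3-70B average retained by the pruned model
# (illustrative ratio, not a figure from the card).
retained = averages["nyun-c2-llama3-50B"] / averages["Meta-Llama3-70B"]

for model, avg in averages.items():
    print(f"{model}: {avg}")
print(f"Retained vs Meta-Llama3-70B: {retained:.1%}")
```

The recomputed averages match the Average row, and the pruned model keeps roughly 97% of the Meta-Llama3-70B average on these six benchmarks.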

	- Developed by: [Nyun AI](https://nyunai.com/)
	- Repository: [GitHub](https://github.com/nyunAI/PruneGPT)