deepnight-research/lil-c3po
Model Details:
lil-c3po is an open-source large language model (LLM) produced by a linear merge of two fine-tuned Mistral-7B models, referred to internally as c3-1 and c3-2. Both models were developed in-house, and the merge is intended to combine their individual strengths in a single model.
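As a rough illustration of what a linear merge does, the sketch below interpolates two checkpoints parameter by parameter. It is not the procedure used to build lil-c3po: the card does not state the merge weights or tooling, so the equal 0.5/0.5 weighting, the `linear_merge` helper, and the toy parameter names are all assumptions made for the example.

```python
from typing import Dict, List

# A checkpoint is modelled here as a mapping from parameter name to a flat
# list of floats. Real checkpoints hold tensors, but the arithmetic is the same.
Weights = Dict[str, List[float]]


def linear_merge(a: Weights, b: Weights, weight_a: float = 0.5) -> Weights:
    """Return weight_a * a + (1 - weight_a) * b for every parameter."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    weight_b = 1.0 - weight_a
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [weight_a * x + weight_b * y for x, y in zip(a[name], b[name])]
    return merged


# Toy stand-ins for c3-1 and c3-2 (hypothetical values, not real weights).
c3_1 = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
c3_2 = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = linear_merge(c3_1, c3_2)
```

A linear merge only makes sense when both checkpoints share the same architecture and parameter shapes, which holds here because both are fine-tunes of Mistral-7B.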
Model Architecture:
lil-c3po inherits its architecture from c3-1 and c3-2, which share the Mistral-7B design: Grouped-Query Attention, Sliding-Window Attention, and a byte-fallback BPE tokenizer. The merge aims to capitalize on the strengths of both models for improved language understanding and generation.
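To make two of these attention features concrete, here is a small sketch of a causal sliding-window mask and of the query-to-key/value head mapping used in grouped-query attention. The function names, the window convention (each position sees itself plus the preceding `window - 1` positions), and the toy sizes are assumptions for illustration, not values read from the lil-c3po configuration.

```python
from typing import List


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Attention is causal (j <= i) and limited to the most recent `window`
    positions, counting the current one.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: consecutive query heads share one KV head."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


mask = sliding_window_mask(seq_len=5, window=3)
shared_kv = [kv_head_for_query(h, n_q_heads=8, n_kv_heads=2) for h in range(8)]
```

With these toy sizes, the last position attends only to positions 2 through 4, and query heads 0 to 3 share key/value head 0 while heads 4 to 7 share head 1.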
Training Details:
- The first model, internally referred to as c3-1, is a 7B-parameter large language model fine-tuned on the Intel Gaudi 2 processor. It was trained with the Direct Preference Optimization (DPO) method and is designed to perform well across a variety of language tasks.
- The second model, internally referred to as c3-2, is an instruction-fine-tuned version of Mistral-7B. Its instruction fine-tuning is intended to improve language understanding in instructional contexts.
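For readers unfamiliar with DPO (commonly expanded as Direct Preference Optimization), the sketch below computes the standard per-pair DPO loss from sequence log-probabilities. It shows the published objective in general form. The card gives no hyperparameters or training code for c3-1, so the `beta` value and the example log-probabilities are assumptions.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the model card
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    The margin is how much more the policy prefers the chosen response over
    the rejected one, relative to the frozen reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    # -log(sigmoid(z)) equals log(1 + exp(-z)).
    return math.log1p(math.exp(-beta * margin))


# Policy identical to the reference: no preference margin.
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Policy has shifted probability toward the chosen response.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss equals log 2 when the policy matches the reference and falls as the policy favors the chosen response more strongly than the reference does.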
License:
lil-c3po is released under the MIT license, fostering open-source collaboration and innovation.
Intended Use:
This merged model is intended for a broad range of language tasks and inherits the capabilities of the fine-tuned c3-1 and c3-2 models.
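A minimal way to try the model is sketched below. It assumes the repository `deepnight-research/lil-c3po` can be loaded through the Hugging Face `transformers` auto classes. Because the card does not document a prompt format, it passes the prompt as plain text with no chat template. The `generate` helper and its defaults are hypothetical.

```python
MODEL_ID = "deepnight-research/lil-c3po"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the model on first use and return the decoded completion."""
    # Imported lazily so that defining this helper does not require the
    # library to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a linear model merge is.")` downloads the full 7B checkpoint, so expect a large download and substantial memory use on first run.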
Out-of-Scope Uses:
lil-c3po is versatile, but in most cases it may need further fine-tuning for specific tasks. The model should not be used to intentionally create hostile or alienating environments for people.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 68.03 |
| AI2 Reasoning Challenge (25-shot) | 65.02 |
| HellaSwag (10-shot) | 84.45 |
| MMLU (5-shot) | 62.36 |
| TruthfulQA (0-shot) | 68.73 |
| Winogrande (5-shot) | 79.16 |
| GSM8k (5-shot) | 48.45 |
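The "Avg." row appears to be the unweighted mean of the six benchmark scores. The short check below reproduces it from the values in the table. The dictionary keys are just labels for this example.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge": 65.02,
    "HellaSwag": 84.45,
    "MMLU": 62.36,
    "TruthfulQA": 68.73,
    "Winogrande": 79.16,
    "GSM8k": 48.45,
}

# Unweighted mean across the six benchmarks, rounded to two decimals.
average = sum(scores.values()) / len(scores)
rounded_average = round(average, 2)
print(rounded_average)
```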
Evaluation results
- AI2 Reasoning Challenge (25-shot), test set: normalized accuracy 65.02 (Open LLM Leaderboard)
- HellaSwag (10-shot), validation set: normalized accuracy 84.45 (Open LLM Leaderboard)
- MMLU (5-shot), test set: accuracy 62.36 (Open LLM Leaderboard)
- TruthfulQA (0-shot), validation set: mc2 68.73 (Open LLM Leaderboard)
- Winogrande (5-shot), validation set: accuracy 79.16 (Open LLM Leaderboard)
- GSM8k (5-shot), test set: accuracy 48.45 (Open LLM Leaderboard)