---
base_model:
- happzy2633/qwen2.5-7b-ins-v3
- bunnycore/Qwen2.5-7B-Matrix
- bunnycore/Qwen2.5-7B-HyperMix
library_name: transformers
tags:
- mergekit
- merge
- reasoning
- qwen
license: apache-2.0
language:
- en
pipeline_tag: text-generation
model-index:
- name: Qwen2.5-7B-Anvita
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 64.33
name: strict accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 35.48
name: normalized accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 15.86
name: exact match
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 10.29
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 13.47
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 35.17
name: accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita
name: Open LLM Leaderboard
---

# Qwen2.5-7B-Anvita

## Overview

Anvita is a reasoning-oriented language model designed to connect ideas and understand complex inputs. The name derives from the Sanskrit word for "connected" or "understood," reflecting the model's emphasis on intellectual depth and comprehension and making it well suited to tasks that require nuanced understanding and sophisticated reasoning.
Built with the DARE TIES merge method, Anvita integrates several pre-trained language models:

- bunnycore/Qwen2.5-7B-HyperMix
- bunnycore/Qwen2.5-7B-Matrix
- happzy2633/qwen2.5-7b-ins-v3

This combination optimizes Anvita for multi-step reasoning, dynamic conversation, and high-quality text generation.
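For intuition, DARE TIES merges models by sparsifying each model's task vector (its parameter delta from the base model): entries are randomly dropped at a rate set by `density` and the survivors rescaled so the expected delta is unchanged, before TIES-style sign election resolves conflicts between models. Below is a minimal, illustrative sketch of just the drop-and-rescale step; the function name and toy tensors are assumptions for illustration, not mergekit's actual implementation.

```python
import torch

def dare_drop_and_rescale(delta: torch.Tensor, density: float) -> torch.Tensor:
    """Keep each delta entry with probability `density`, rescaling
    survivors by 1/density so the expected delta is preserved."""
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

# Toy example: a task vector (fine-tuned weights minus base weights)
# sparsified at density 0.5, the mid-layer value in the config below.
delta = torch.randn(4, 4)
sparse_delta = dare_drop_and_rescale(delta, density=0.5)
```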
## Evaluation Results

| Metric | Value |
|---|---|
| Avg. | 29.10 |
| IFEval (0-Shot) | 64.33 |
| BBH (3-Shot) | 35.48 |
| MATH Level 5 (4-Shot) | 15.86 |
| GPQA (0-Shot) | 10.29 |
| MuSR (0-Shot) | 13.47 |
| MMLU-PRO (5-Shot) | 35.17 |

Detailed results are available on the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Qwen2.5-7B-Anvita). For personal benchmarks, see `PERSONAL_BENCHMARK.md`.
For optimal reasoning performance, BF16 precision and the Entropic Chain of Thought decoding method are recommended. This experimental decoder combines entropy-guided selection with CoT decoding to improve output quality.
## Features

- **Enhanced Reasoning**: Optimized for multi-step reasoning across diverse domains.
- **Long Sequence Handling**: Processes extended inputs without losing context.
- **Conversational Fluency**: Engages in fluid, context-aware dialogue.
- **Dense Knowledge Integration**: Combines knowledge from multiple base models for comprehensive understanding.
## Installation

Anvita integrates with the Transformers library. Install the dependencies for the examples below (PyTorch is required to load the model):

```bash
pip install torch transformers rich
```
## Quick Start

Here's a simple example of using Anvita to generate a response with Entropic Chain of Thought decoding. Note that `cot_decode_speculative` is provided by the experimental Entropic CoT decoder mentioned above, not by Transformers; a simplified sketch of this style of decoding follows the example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from rich.console import Console
from rich.markdown import Markdown

# Initialize the console for rich-text output
console = Console()

# Load the tokenizer and model (BF16 is recommended for reasoning quality)
MODEL_PATH = "sethuiyer/Qwen2.5-7B-Anvita"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16
).to("cuda")

QUESTION = "Is 9.11 greater than 9.8?"
messages = [
    {"role": "user", "content": QUESTION},
]

# Generate the answer using Entropic Chain of Thought decoding
# (cot_decode_speculative comes from the experimental decoder script)
answer, score = cot_decode_speculative(
    model, tokenizer, messages, k=2, max_new_tokens=2058
)

# Format and display the answer as Markdown
markdown_answer = f"""
# **Answer:**
{answer}

**Score:** {score}
"""
console.print(Markdown(markdown_answer))
```
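If the Entropic CoT decoder script is not at hand, the sketch below shows the general shape of CoT decoding it builds on: branch on the top-k candidates for the first generated token, decode each branch greedily, and keep the completion whose tokens are predicted with the highest average confidence margin. The function name, scoring rule, and defaults here are illustrative assumptions, not the actual `cot_decode_speculative` implementation.

```python
import torch

@torch.no_grad()
def cot_decode_sketch(model, tokenizer, messages, k=2, max_new_tokens=256):
    """Hypothetical, simplified CoT decoding: branch on top-k first tokens."""
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs.input_ids.shape[-1]

    # Distribution over the first generated token.
    first_logits = model(**inputs).logits[0, -1]
    top_first = torch.topk(torch.softmax(first_logits, dim=-1), k)

    best_answer, best_score = "", float("-inf")
    for token_id in top_first.indices:
        # Append one candidate first token, then decode greedily.
        ids = torch.cat([inputs.input_ids, token_id.view(1, 1)], dim=-1)
        out = model.generate(
            ids,
            attention_mask=torch.ones_like(ids),
            max_new_tokens=max_new_tokens,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Score the branch by the average top-1 vs. top-2 probability margin,
        # a simple proxy for how confidently the path was decoded.
        margins = []
        for step_logits in out.scores:
            top2 = torch.topk(torch.softmax(step_logits[0], dim=-1), 2).values
            margins.append((top2[0] - top2[1]).item())
        score = sum(margins) / max(len(margins), 1)
        if score > best_score:
            text = tokenizer.decode(
                out.sequences[0, prompt_len:], skip_special_tokens=True
            )
            best_answer, best_score = text, score
    return best_answer, best_score
```

This sketch is called exactly like the Quick Start above, e.g. `cot_decode_sketch(model, tokenizer, messages, k=2)`. The entropic variant referenced earlier presumably also folds per-step entropy into the confidence score; treat this only as a starting point.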
## Configuration

The following YAML configuration was used to produce Anvita:

```yaml
slices:
models:
  - model: bunnycore/Qwen2.5-7B-Matrix
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
  - model: bunnycore/Qwen2.5-7B-HyperMix
  - model: happzy2633/qwen2.5-7b-ins-v3
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: bunnycore/Qwen2.5-7B-HyperMix
parameters:
  int8_mask: true
dtype: bfloat16
```
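To reproduce a merge from a configuration like this, the file can be passed to mergekit's `mergekit-yaml` command-line tool; the config filename and output directory below are placeholders:

```bash
pip install mergekit
mergekit-yaml anvita-config.yml ./Qwen2.5-7B-Anvita-merge --cuda
```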