---
license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---
# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B
**Architecture:** Llama 3.1 - 8B
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** 10/25/2024
**License:** Apache 2.0
---
## Model Overview
**Llama3.1-TheiaFire-DarkFusion-8B** is a fusion of four specialized models, combined to balance technical reasoning, creativity, and uncensored output across a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose AI capabilities, this model aims to deliver strong results.
This model was merged using the **TIES** merge method to ensure optimal blending of layer weights and parameter configurations, resulting in a model that excels in multiple domains.
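As a rough illustration of what TIES does under the hood, the sketch below applies its three steps (trim each task vector by a density threshold, elect a per-parameter sign, then merge only the values that agree with that sign) to toy 1-D vectors. All numbers here are invented for demonstration; the real merge operates on full model tensors via a merging tool such as mergekit.

```python
# Illustrative TIES sketch on toy "task vectors" (fine-tuned minus base weights).
# Toy data only -- not the actual merge pipeline used for this model.

def trim(task_vector, density):
    """Keep only the top-`density` fraction of entries by magnitude."""
    k = max(1, int(len(task_vector) * density))
    threshold = sorted((abs(v) for v in task_vector), reverse=True)[k - 1]
    return [v if abs(v) >= threshold else 0.0 for v in task_vector]

def ties_merge(task_vectors, densities, weights):
    trimmed = [trim(tv, d) for tv, d in zip(task_vectors, densities)]
    merged = []
    for i in range(len(task_vectors[0])):
        # Elect the dominant sign for this parameter from the weighted values.
        elected = 1.0 if sum(w * t[i] for t, w in zip(trimmed, weights)) >= 0 else -1.0
        # Disjoint merge: average only the entries agreeing with the elected sign.
        agree = [(t[i], w) for t, w in zip(trimmed, weights) if t[i] * elected > 0]
        total_w = sum(w for _, w in agree)
        merged.append(sum(v * w for v, w in agree) / total_w if agree else 0.0)
    return merged

# Two toy task vectors that conflict in sign on the last parameter.
merged = ties_merge(
    task_vectors=[[0.9, -0.1, 0.5], [0.8, 0.05, -0.6]],
    densities=[0.67, 0.67],
    weights=[0.5, 0.5],
)
print(merged)
```

The sign-election step is what lets TIES resolve conflicting parameter updates between donor models instead of averaging them toward zero.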
---
> **Note:** For best results in LM Studio, leave the system prompt blank; the tokenizer appears to struggle with system prompts.
## Model Components
The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:
1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
- **Purpose:** Balances technical vision and crypto capabilities.
- **Training Focus:** This model specializes in blockchain data and was trained on a large dataset of crypto whitepapers, research reports, and market data.
- **Unique Feature:** Fine-tuned using LoRA for optimized crypto-specific performance.
2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
- **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
- **Unique Feature:** This model is equipped with a 128K context window and comes with built-in tools for ReAct, calculator, search, and more.
3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
- **Purpose:** Provides uncensored, creativity-driven responses ideal for writing, role-playing, and in-depth conversations.
- **Unique Feature:** Uncensored nature allows for open exploration of creative writing and darker, more complex roleplay scenarios.
4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
- **Purpose:** Enhances performance with latent diffusion model blending.
- **Unique Feature:** This model builds upon Llama-3.1’s foundation and improves unseen task generalization with latent diffusion.
---
## Model Specifications
### Merge Configuration
```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      density: 0.4  # Balancing technical vision and crypto capabilities
      weight: 0.3
  - model: EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
    parameters:
      density: 0.6  # Giving priority to code-based reasoning and agentic capabilities
      weight: 0.4
  - model: aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
    parameters:
      density: 0.5  # Focus on creativity and uncensored roleplay flexibility
      weight: 0.2
  - model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
    parameters:
      density: 0.5  # Blending latent diffusion capabilities for unseen tasks
      weight: 0.1
merge_method: ties
base_model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```
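Since `normalize: true` rescales the model weights to sum to one, it can be useful to sanity-check a merge configuration before running it. The short script below (illustrative only, with the values copied from the configuration above) verifies the weight budget:

```python
# Sanity-check the merge weights from the configuration above.
# `normalize: true` would rescale them anyway, but a budget that already
# sums to 1.0 keeps the per-model weight comments honest.
weights = {
    "Chainbase-Labs/Theia-Llama-3.1-8B-v1": 0.3,
    "EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO": 0.4,
    "aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored": 0.2,
    "DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst": 0.1,
}

total = sum(weights.values())
assert abs(total - 1.0) < 1e-9, f"weights sum to {total}, not 1.0"
print(f"weight budget OK: {total:.2f}")
```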
---
## Intended Use Cases
1. **Crypto Analysis & Blockchain Projects**
- Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
- Ideal for creating blockchain-related content or automating crypto data analysis.
2. **Advanced Coding Assistant**
- Built-in support for agentic behavior such as reasoning and action, making it perfect for AI-driven coding assistance.
- Handles large-scale coding projects with tools like search and calculator integration.
3. **Creative Writing & Roleplay**
- **Uncensored output** allows for rich, expressive writing ideal for novels, creative pieces, or roleplay scenarios.
- Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.
4. **Unseen Task Generalization**
- With the latent diffusion capabilities, this model can handle unseen tasks by learning weight distributions in an adaptive manner, improving performance on novel datasets or tasks.
---
## Performance
- The model has shown significant improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output**.
- **Enhanced task generalization** due to latent diffusion model blending techniques.
---
## Model Capabilities
- **Context Window**: 128K (capable of handling long-form tasks like novel writing and in-depth research).
- **Agentic Tools**: Built-in tools like search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
---
## Usage
This model can be used with popular libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**:
### Example Code
```python
import transformers
import torch

# Use the full Hugging Face repo ID for the merged model
model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# If generation degrades, try omitting the system message (see the note above).
messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```
---
## Limitations
- **Uncensored Output**: While this model offers creative freedom, it may produce content that could be considered inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, this one may reflect inherent biases in the training data. Users are encouraged to review and edit the outputs before use.
---
## Acknowledgments
This model is a collective effort, combining the groundbreaking work from:
- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **Aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)
Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.
---