---
license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---

# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

- **Architecture:** Llama 3.1 - 8B
- **Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
- **Merge Method:** TIES
- **Merge Date:** 10/25/2024
- **License:** Apache 2.0

---

## Model Overview

**Llama3.1-TheiaFire-DarkFusion-8B** is a highly specialized fusion of four cutting-edge models, meticulously combined to provide an exceptional balance of technical reasoning, creativity, and uncensored freedom across a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose AI capabilities, this model delivers state-of-the-art results.

This model was merged using the **TIES** merge method to ensure optimal blending of layer weights and parameter configurations, resulting in a model that excels across multiple domains.

---

For optimal results, leave the system prompt blank in LM Studio; the tokenizer appears to struggle with system prompts.

## Model Components

The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:

1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
   - **Purpose:** Balances technical vision and crypto capabilities.
   - **Training Focus:** Specializes in blockchain data; trained on a large dataset of crypto whitepapers, research reports, and market data.
   - **Unique Feature:** Fine-tuned using LoRA for optimized crypto-specific performance.

2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
   - **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
   - **Unique Feature:** Equipped with a 128K context window and built-in tools for ReAct, calculator, search, and more.

3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
   - **Purpose:** Provides uncensored, creativity-driven responses ideal for writing, role-playing, and in-depth conversations.
   - **Unique Feature:** Its uncensored nature allows for open exploration of creative writing and darker, more complex roleplay scenarios.

4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
   - **Purpose:** Enhances performance through latent diffusion model blending.
   - **Unique Feature:** Builds on the Llama-3.1 foundation and improves generalization to unseen tasks via latent diffusion.
---

## Model Specifications

### Merge Configuration

```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      density: 0.4  # Balancing technical vision and crypto capabilities
      weight: 0.3
  - model: EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
    parameters:
      density: 0.6  # Giving priority to code-based reasoning and agentic capabilities
      weight: 0.4
  - model: aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
    parameters:
      density: 0.5  # Focus on creativity and uncensored roleplay flexibility
      weight: 0.2
  - model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
    parameters:
      density: 0.5  # Blending latent diffusion capabilities for unseen tasks
      weight: 0.1
merge_method: ties
base_model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```

A Python sketch showing how a configuration like this can be applied with mergekit appears after the Limitations section below.

---

## Intended Use Cases

1. **Crypto Analysis & Blockchain Projects**
   - Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
   - Ideal for creating blockchain-related content or automating crypto data analysis.

2. **Advanced Coding Assistant**
   - Built-in support for agentic behavior such as reasoning and action, making it well suited for AI-driven coding assistance.
   - Handles large-scale coding projects with tools like search and calculator integration.

3. **Creative Writing & Roleplay**
   - **Uncensored output** allows for rich, expressive writing ideal for novels, creative pieces, or roleplay scenarios.
   - Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.

4. **Unseen Task Generalization**
   - With its latent diffusion capabilities, the model can handle unseen tasks by learning weight distributions adaptively, improving performance on novel datasets or tasks.

---

## Performance

- The model has shown significant improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output**.
- **Enhanced task generalization** thanks to latent diffusion model blending techniques.

---

## Model Capabilities

- **Context Window:** 128K (capable of handling long-form tasks like novel writing and in-depth research).
- **Agentic Tools:** Built-in tools such as search and calculator.
- **Safety:** The model is uncensored; responsible prompting is encouraged to ensure the best user experience and ethical usage.

---

## Usage

This model can be used with popular AI libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers** (a LangChain sketch follows the Limitations section).

### Example Code

```python
import transformers
import torch

model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```

---

## Limitations

- **Uncensored Output:** While this model offers creative freedom, it may produce content that is inappropriate or unsuitable for certain contexts.
- **Bias:** As with all language models, this one may reflect biases inherent in its training data. Users are encouraged to review and edit outputs before use.
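---

## Reproducing the Merge

The merge configuration above follows [mergekit](https://github.com/arcee-ai/mergekit) conventions. The snippet below is a minimal sketch of how such a config can be applied with mergekit's Python API; it is illustrative rather than the exact procedure used for this checkpoint, and option names may differ slightly between mergekit versions.

```python
# Illustrative sketch: apply the TIES merge config above with mergekit.
# Assumes the YAML above has been saved to "config.yaml" and that
# `pip install mergekit` has been run; paths and options are examples only.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "config.yaml"
OUTPUT_PATH = "./Llama3.1-TheiaFire-DarkFusion-8B"

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # run the merge on GPU if one is available
        copy_tokenizer=True,             # copy the base model's tokenizer into the output
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

---

## LangChain Example

Since the Usage section mentions LangChain alongside Transformers, here is a minimal, hedged sketch of wrapping the model for LangChain via the `langchain-huggingface` integration; the package and class names assume a recent LangChain release and may need adjusting for your version.

```python
# Illustrative sketch: wrap a Transformers pipeline for use in LangChain.
# Assumes `pip install langchain-huggingface transformers torch`.
import torch
from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline

pipe = pipeline(
    "text-generation",
    model="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
    max_new_tokens=256,
)

llm = HuggingFacePipeline(pipeline=pipe)
print(llm.invoke("Explain the TIES merge method in two sentences."))
```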
---

## Acknowledgments

This model is a collective effort, combining the groundbreaking work from:

- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)

Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.

---