shuttleai/shuttle-2.5-mini-GPTQ-Int4

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision


xtristan committed on Jul 27

Commit

31dd7f5

•

1 Parent(s): 264c792

Create README.md

Files changed (1)

README.md +71 -0

README.md ADDED

	@@ -0,0 +1,71 @@

+---
+license: apache-2.0
+---
+<p style="font-size:20px;" align="center">
+<div style="width: 100%; height: 300px; overflow: hidden; border-radius: 15px; margin: auto; position: relative;">
+    <img
+        src="https://cdn.shuttleai.app/thumbnail.png"
+        alt="ShuttleAI Thumbnail"
+        style="width: 100%; height: auto; display: block; margin: auto; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -50%); object-fit: cover;">
+</div>
+<p align="center">
+    💻 <a href="https://shuttleai.app/" target="_blank">Use via API</a>
+</p>
+## shuttle-2.5-mini-GPTQ-Int4 [2024/07/26]
+We are excited to introduce Shuttle-2.5-mini, our next-generation state-of-the-art language model designed to excel in complex chat, multilingual communication, reasoning, and agent tasks.
+- **Shuttle-2.5-mini** is a fine-tuned version of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407), tuned to emulate the writing style of Claude 3 models and trained extensively on role-playing data.
+## Model Details
+* **Model Name**: Shuttle-2.5-mini
+* **Developed by**: ShuttleAI Inc.
+* **Base Model**: [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407)
+* **Parameters**: 12B
+* **Language(s)**: Multilingual
+* **Repository**: [https://huggingface.co/shuttleai](https://huggingface.co/shuttleai)
+* **Fine-Tuned Model**: [https://huggingface.co/shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
+* **Paper**: Shuttle-2.5-mini (Upcoming)
+* **License**: Apache 2.0
+## Base Model Architecture
+**Mistral Nemo** is a transformer model with the following architecture choices:
+- **Layers**: 40
+- **Dimension**: 5,120
+- **Head Dimension**: 128
+- **Hidden Dimension**: 14,336
+- **Activation Function**: SwiGLU
+- **Number of Heads**: 32
+- **Number of kv-heads**: 8 (GQA)
+- **Vocabulary Size**: 2^17 (131,072, approximately 128k)
+- **Rotary Embeddings**: Theta = 1M
+### Key Features
+- Released under the Apache 2 License
+- Trained with a 128k context window
+- Pretrained on a large proportion of multilingual and code data
+- Fine-tuned to emulate the prose quality of Claude 3 models, with extensive training on role-play data
+## Fine-Tuning Details
+- **Training Setup**: Trained on 4x A100 GPUs for 2 epochs, totaling 24 hours.
+## Prompting
+Shuttle-2.5-mini uses ChatML as its prompting format:
+```
+<|im_start|>system
+You are a pirate! Yardy harr harr!<|im_end|>
+<|im_start|>user
+Where are you currently!<|im_end|>
+<|im_start|>assistant
+Look ahoy ye scallywag! We're on the high seas!<|im_end|>
+```
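
Note on the ChatML format shown in the README: a minimal Python sketch of how such a prompt string could be assembled from a list of messages. The helper name `build_chatml_prompt` and its `add_generation_prompt` flag are hypothetical and not part of the ShuttleAI release. The newline after each `<|im_end|>` and the trailing open `assistant` turn follow common ChatML usage and are assumptions here, not something the README confirms.

```python
from typing import Dict, List

# ChatML delimiters as they appear in the README's prompting example.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = True,
) -> str:
    """Render role/content messages as a ChatML string (hypothetical helper)."""
    parts = []
    for message in messages:
        # Each turn: start token, role, newline, content, end token.
        # The newline after the end token is an assumed convention.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently!"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

When generating with a prompt built this way, `<|im_end|>` would typically serve as the stop sequence. If the repository's tokenizer ships its own chat template, that template should take precedence over this hand-rolled sketch.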