|
--- |
|
library_name: transformers |
|
datasets: |
|
- fka/awesome-chatgpt-prompts |
|
base_model: |
|
- unsloth/mistral-7b-instruct-v0.2-bnb-4bit |
|
--- |
|
|
|
|
|
|
# Model Card for Mistral-7B Instruct v0.2 Finetuned Prompt Generator |
|
|
|
This model is fine-tuned for generating contextually relevant prompts for various scenarios and domains, helping users craft detailed and effective prompt instructions. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
This model is a fine-tuned version of [unsloth/mistral-7b-instruct-v0.2-bnb-4bit](https://huggingface.co/unsloth/mistral-7b-instruct-v0.2-bnb-4bit), aimed at providing high-quality prompt generation across diverse topics.

It excels at understanding input instructions and generating structured prompts that fit various creative, professional, and instructional needs.
|
|
|
- **Developed by:** Abhinav Sarkar |
|
- **Shared by:** abhinavsarkar |
|
- **Model type:** Causal Language Model |
|
- **Languages:** English |
|
- **Finetuned from model:** Mistral-7B-Instruct-v0.2-bnb-4bit |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
|
|
This model is designed for generating context-specific prompts to assist with content creation, task descriptions, and crafting prompts for AI-based systems. |
|
It can be utilized to streamline processes in areas such as software development, customer interaction, and creative writing. |
|
|
|
### Downstream Use |
|
|
|
This model can be incorporated into tools or systems where high-quality prompt generation is essential, such as: |
|
- AI writing assistants |
|
- Educational tools |
|
- Chatbots requiring specialized responses or tailored prompts |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
Use the following code snippets to get started with the model:
|
|
|
- Prerequisites
|
```python |
|
!pip install -U bitsandbytes
!pip install -U transformers
!pip install -U accelerate  # needed for 4-bit loading with device_map
|
``` |
|
|
|
- Loading the model and its tokenizer |
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# device_map="auto" places the 4-bit weights on the available GPU
# (requires the accelerate package installed above)
model = AutoModelForCausalLM.from_pretrained(
    "abhinavsarkar/mistral-7b-instruct-v0.2-bb-4bit-finetuned-prompt-generator",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "abhinavsarkar/mistral-7b-instruct-v0.2-bb-4bit-finetuned-prompt-generator"
)
|
``` |
|
|
|
- Running inference with the model
|
```python |
|
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. |
|
|
|
<|Instruction|> |
|
{} |
|
|
|
<|Input|>
|
{} |
|
|
|
<|Response|> |
|
{} |
|
""" |
|
|
|
input_text = "Your Input text" |
|
|
|
inputs = tokenizer([ |
|
prompt.format( |
|
"You are a prompt engineer. Your task is to craft a prompt based on the given input that ensures the model behaves exactly as described by the provided word.", # instruction |
|
input_text, # input |
|
"", # output - leave this blank for generation! |
|
) |
|
], return_tensors="pt").to(model.device)
|
|
|
with torch.no_grad(): |
|
output = model.generate(**inputs, max_new_tokens=512) |
|
|
|
response = tokenizer.decode(output[0], skip_special_tokens=True) |
|
|
|
start_token = "<|Response|>" |
|
end_token = "<|End|>" |
|
|
|
start_idx = response.find(start_token) + len(start_token)
end_idx = response.find(end_token)

# Fall back to the end of the string if the end token was not generated
final_response = (response[start_idx:end_idx] if end_idx != -1 else response[start_idx:]).strip()
|
print(final_response) |
|
``` |
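For the downstream integrations mentioned earlier (writing assistants, educational tools, chatbots), the same steps can be wrapped in a small reusable helper. This is a minimal sketch: the name `generate_prompt` is hypothetical, and it reuses the `model`, `tokenizer`, and `prompt` template defined above.

```python
def generate_prompt(input_text: str, max_new_tokens: int = 512) -> str:
    """Return a crafted prompt for `input_text` (reuses model/tokenizer/prompt above)."""
    batch = tokenizer(
        [prompt.format(
            "You are a prompt engineer. Your task is to craft a prompt based on the given input that ensures the model behaves exactly as described by the provided word.",
            input_text,
            "",  # response left blank for generation
        )],
        return_tensors="pt",
    ).to(model.device)

    with torch.no_grad():
        out = model.generate(**batch, max_new_tokens=max_new_tokens)

    text = tokenizer.decode(out[0], skip_special_tokens=True)
    start = text.find("<|Response|>") + len("<|Response|>")
    end = text.find("<|End|>")
    return (text[start:end] if end != -1 else text[start:]).strip()

# Example call (hypothetical input):
print(generate_prompt("Cybersecurity expert"))
```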
|
|
|
### Possible Errors and Solutions |
|
|
|
**Quantization Warnings**: |
|
If you receive warnings about unused arguments or quantization settings, ensure you have `bitsandbytes` installed: |
|
```python |
|
!pip install -U bitsandbytes |
|
``` |
|
|
|
**Tokenizer Issues**: |
|
If you encounter tokenizer-related errors, update the `transformers` library: |
|
```python |
|
!pip install -U transformers |
|
``` |
|
|
|
Restart the session after installing these packages. |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
The model was fine-tuned on [fka/awesome-chatgpt-prompts](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts), a curated dataset focused on general-purpose prompt generation, ensuring broad applicability across a wide range of topics and tasks.
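
As a rough, hedged sketch of how this data could be mapped into the instruction format shown above (not the exact preprocessing script; the `act` and `prompt` column names come from the dataset):

```python
from datasets import load_dataset

# Each record in fka/awesome-chatgpt-prompts has an "act" (the persona word)
# and a "prompt" (the full prompt text).
dataset = load_dataset("fka/awesome-chatgpt-prompts", split="train")

TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

<|Instruction|>
{instruction}

<|Input|>
{input}

<|Response|>
{response}<|End|>"""  # <|End|> marks where generation should stop

def to_example(row):
    return {
        "text": TEMPLATE.format(
            instruction="You are a prompt engineer. Your task is to craft a prompt based on the given input that ensures the model behaves exactly as described by the provided word.",
            input=row["act"],
            response=row["prompt"],
        )
    }

train_data = dataset.map(to_example)
```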
|
|
|
### Training Procedure |
|
|
|
The model was fine-tuned using the Hugging Face Transformers library and Unsloth on hosted GPU environments (Google Colab, Kaggle), leveraging mixed-precision training for optimized performance.
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** fp16 mixed precision |
|
- **Epochs:** 30 |
|
- **Batch size:** 2 |
|
- **Gradient accumulation steps:** 4 |
|
- **Learning rate:** 2e-4 |
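
The exact training script is not included in this card, but as a hedged sketch, the hyperparameters above might map onto an Unsloth + TRL setup roughly as follows. Everything beyond the listed values (the LoRA rank, target modules, and `train_data` from the Training Data sketch) is an assumption:

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the 4-bit base model via Unsloth and attach LoRA adapters
# (rank and target modules are assumptions, not confirmed by this card)
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_data,          # formatted as in the Training Data sketch
    dataset_text_field="text",
    args=TrainingArguments(
        per_device_train_batch_size=2,  # batch size: 2
        gradient_accumulation_steps=4,  # effective batch size: 8
        num_train_epochs=30,            # epochs: 30
        learning_rate=2e-4,             # learning rate: 2e-4
        fp16=True,                      # fp16 mixed precision
        output_dir="outputs",
    ),
)
trainer.train()
```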
|
|
|
## Technical Specifications |
|
|
|
### Model Architecture and Objective |
|
|
|
This model is based on the Mistral-7B architecture, optimized for efficient inference via 4-bit quantization, and fine-tuned for causal language modeling.
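
For reference, 4-bit loading in Transformers is expressed through `BitsAndBytesConfig`. The base repository already ships its own quantization config, so this sketch only illustrates the mechanism; the specific settings shown are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit (bitsandbytes) settings; when a repo ships a quantization
# config, it is applied automatically on load.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumed quant type
    bnb_4bit_compute_dtype=torch.float16,  # matches the fp16 training regime
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
```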
|
|
|
### Compute Infrastructure |
|
|
|
#### Hardware |
|
|
|
Fine-tuning was conducted on two NVIDIA T4 GPUs.
|
|
|
#### Software |
|
|
|
- **Framework**: PyTorch |
|
- **Libraries**: Hugging Face Transformers, Unsloth |
|
|
|
## More Information |
|
|
|
For further details or inquiries, please reach out via [LinkedIn](https://www.linkedin.com/in/abhinavsarkarrr/) or email at [email protected]. |
|
|
|
## Model Card Authors |
|
|
|
- Abhinav Sarkar |
|
|
|
## Model Card Contact |
|
|
|
- [email protected] |
|
|
|