Commit 76221c9 by Bitext
Parent(s): 19620f2

Create README.md: detailing the fine-tuned model's training data, architecture, and intended use.

README.md ADDED
@@ -0,0 +1,89 @@
---
license: apache-2.0
tags:
- axolotl
- generated_from_trainer
- text-generation-inference
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: mistral
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-Retail-v2
  results: []
---

# Mistral-7B-Retail-v2

## Model Description

"Mistral-7B-Retail-v2" is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted to answer questions about retail services.

## Intended Use

- **Recommended applications**: This model is designed for retail environments. It can be integrated into customer service chatbots or help systems to provide real-time responses to common retail-related inquiries.
- **Out-of-scope**: This model should not be used for medical, legal, or other safety-critical purposes.

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Retail-v2")
tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Retail-v2")

# The tokenizer adds the BOS token (<s>) automatically, so it is not repeated in the prompt.
inputs = tokenizer("[INST] How can I return a purchased item? [/INST]", return_tensors="pt")
# max_new_tokens bounds the generated continuation; max_length would also count the prompt tokens.
outputs = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"], max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Model Architecture

"Mistral-7B-Retail-v2" uses the `MistralForCausalLM` architecture with a `LlamaTokenizer`. It keeps the configuration of the base model and is fine-tuned to respond to retail-related questions.

## Training Data

The model was trained on a dataset designed for retail question-and-answer interactions. The dataset covers 46 distinct intents, including `add_product`, `availability_in_store`, `cancel_order`, `pay`, `refund_policy`, `track_order`, and `use_app`, reflecting common retail transactions and customer service interactions. Each intent contains 1000 examples, giving broad coverage of retail situations.

This training data enables the model to understand and respond to a wide range of retail-related queries in customer service applications. The dataset follows a structured approach, similar to other datasets published on Hugging Face, but is tailored to the retail sector.
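
As an illustration only (the record below is invented, not drawn from the actual dataset), intent-tagged Q&A data of this kind is commonly stored as instruction/response pairs, and the figures above fix the overall dataset size:

```python
# Hypothetical example record: the field names and text are illustrative,
# not copied from the Bitext retail dataset.
example = {
    "intent": "track_order",
    "instruction": "Where is my order right now?",
    "response": "You can track your order from the 'My Orders' page in your account.",
}

# 46 intents with 1000 examples each determines the total number of examples.
num_intents, examples_per_intent = 46, 1000
total_examples = num_intents * examples_per_intent
print(total_examples)  # 46000
```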
48 |
+
|
49 |
+
## Training Procedure
|
50 |
+
|
51 |
+
### Hyperparameters
|
52 |
+
|
53 |
+
- **Optimizer**: AdamW
|
54 |
+
- **Learning Rate**: 0.0002
|
55 |
+
- **Epochs**: 1
|
56 |
+
- **Batch Size**: 8
|
57 |
+
- **Gradient Accumulation Steps**: 4
|
58 |
+
- **Maximum Sequence Length**: 1024 tokens
|
59 |
+
|
60 |
+
### Environment
|
61 |
+
|
62 |
+
- **Transformers Version**: 4.40.0.dev0
|
63 |
+
- **Framework**: PyTorch 2.2.1+cu121
|
64 |
+
- **Tokenizers**: Tokenizers 0.15.0
|
65 |
+
|

## Limitations and Bias

- The model is trained specifically for the retail domain and may not give accurate results outside it.
- The training data may contain biases, so the model's responses should be reviewed carefully.

## Ethical Considerations

Use this model responsibly, especially in scenarios involving personal customer interactions. Its use should not replace necessary human judgment in sensitive situations.

## Acknowledgments

This model was developed by Bitext and trained on their infrastructure.

## License

"Mistral-7B-Retail-v2" is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This license allows anyone to use, modify, and distribute the model freely, provided Bitext is credited.

### Key Points of the Apache 2.0 License

- **Permissibility**: Free use, modification, and distribution.
- **Attribution**: Credit must be given to Bitext, in line with the copyright notices and the license.
- **No Warranty**: The model is provided without any guarantees.

For complete details, see the [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).