Update README.md
AI2sql is a state-of-the-art LLM for converting natural language questions to SQL queries.
## Model Description
This model card describes finetuning the Mistral-7B model with the PEFT library, using bitsandbytes to load the large model in 4-bit precision. The accompanying notebook demonstrates finetuning with Low-Rank Adapters (LoRA), so that only the adapters are trained rather than the entire model. The process is designed to run in Google Colab and applies to any model that supports `device_map`.
## Training Data
The finetuning uses a finance-focused dataset derived from WikiSQL, with 10% of the data used to showcase the process. Each example is rendered into a prompt format so the model can learn the question-to-SQL mapping more easily.
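A minimal sketch of that prompt preparation is shown below. The field names and the template string are assumptions; adapt them to the actual columns of the dataset split you load:

```python
# Sketch: rendering one question/SQL record into an instruction-style training
# prompt. Field names and template wording are illustrative assumptions.
def format_example(example: dict) -> str:
    """Render a single question/SQL pair as a prompt string."""
    return (
        "Translate the question into a SQL query.\n"
        f"Question: {example['question']}\n"
        f"SQL: {example['sql']}"
    )

record = {
    "question": "What was the total revenue in 2020?",
    "sql": "SELECT SUM(revenue) FROM finances WHERE year = 2020",
}
print(format_example(record))
```

Mapping this function over the dataset (e.g. with `datasets.Dataset.map`) yields text that can then be tokenized for training.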
## Training Procedure
The training involves several steps:
1. **Installing Necessary Packages:** Installing the required libraries from source.
2. **Model Loading:** Using QLoRA quantization to load the model, reducing memory usage.
3. **Dataset Preparation:** Tokenizing and splitting the dataset for training and testing.
4. **Applying LoRA:** Utilizing PEFT for applying low-rank adapters to the model.
5. **Running the Training:** Implementing training with specific arguments, showcasing the process with a demo setup.
6. **Evaluating the Model:** Qualitative evaluation through inferences.
## How to Use
The trained adapters can be shared on the Hugging Face Hub for easy loading. Users can load the adapters directly from the Hub and use the model for tasks such as generating SQL queries.
## Limitations and Bias
This finetuning process is specific to the Mistral-7b model and may not generalize to other models. The focus on finance data might limit the model's applicability to other domains.
## Ethical Considerations
Users should be aware of potential biases in the training data, especially given its focus on finance, and should consider this when applying the model to real-world scenarios.
## Acknowledgements
This work utilizes resources and tools from Hugging Face, including the PEFT library, bitsandbytes, and other associated libraries. The process is designed to be accessible and implementable using Google Colab.
## Training procedure