|
--- |
|
datasets: |
|
- airesearch/WangchanX-Legal-ThaiCCL-RAG |
|
language: |
|
- th |
|
pipeline_tag: sentence-similarity |
|
tags: |
|
- legal |
|
- RAG |
|
widget: [] |
|
license: mit |
|
base_model: |
|
- BAAI/bge-m3 |
|
--- |
|
|
|
## WangchanX-Legal-ThaiCCL-Retriever |
|
|
|
This model card describes WangchanX-Legal-ThaiCCL-Retriever, a retriever model fine-tuned from bge-m3 on the WangchanX-Legal-ThaiCCL-RAG dataset. Given a legal question in Thai, it retrieves the relevant sections of legal text, with a focus on Corporate and Commercial Law (CCL). |
|
|
|
**Model Details:** |
|
|
|
* **Base Model:** [bge-m3](https://huggingface.co/BAAI/bge-m3) |
|
* **Fine-tuning Dataset:** [WangchanX-Legal-ThaiCCL-RAG](https://huggingface.co/datasets/airesearch/WangchanX-Legal-ThaiCCL-RAG) |
|
* **Language:** Thai |
|
* **Maximum Sequence Length:** 8192 tokens |
|
* **Output Dimensionality:** 1024 dimensions |
|
* **License:** MIT |
|
|
|
**WangchanX-Legal-ThaiCCL-RAG** |
|
|
|
This dataset supports Thai legal question-answering systems built on Retrieval-Augmented Generation (RAG), with a focus on Corporate and Commercial Law. |
|
|
|
**Intended Use Cases:** |
|
This model is designed for use as a retriever model within a larger RAG pipeline. |
|
* **Legal Question Answering:** Serving as a core component in a larger question-answering system that provides answers to user queries about Thai law. |
|
* **Legal Information Retrieval:** Enabling efficient retrieval of information from Thai legal texts. |
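
As a rough illustration of the retriever role described above, the sketch below embeds a question and a list of candidate legal sections, then ranks the sections by cosine similarity. It is not an official usage recipe. The repository ID `airesearch/WangchanX-Legal-ThaiCCL-Retriever` is an assumption inferred from the model name, as is loading the checkpoint through the `sentence-transformers` library (the base model bge-m3 can be loaded that way). The helper names `cosine`, `rank_passages`, and `retrieve` are hypothetical.

```python
import math
from typing import Sequence

# Assumption: Hub repository ID inferred from the model name in this card.
MODEL_ID = "airesearch/WangchanX-Legal-ThaiCCL-Retriever"


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (passage_index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def retrieve(question, passages, top_k=3):
    """Embed a Thai legal question and candidate sections, then rank them."""
    # Imported lazily so the ranking helpers above work without the library.
    # Assumption: the checkpoint loads with sentence-transformers.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    query_vec = model.encode(question)      # one embedding vector
    passage_vecs = model.encode(passages)   # one embedding vector per passage
    ranked = rank_passages(query_vec, passage_vecs, top_k)
    return [(passages[i], score) for i, score in ranked]
```

Calling `retrieve(question, sections, top_k=5)` with a Thai question string and a list of section texts would return the five highest-scoring sections with their scores. In a full RAG pipeline, the section embeddings would normally be computed once and stored in a vector index rather than re-encoded for every query.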
|
|
|
<!-- ## Usage |
|
|
|
This model is designed for use as a retriever model within a larger RAG pipeline. Given a legal question in Thai, it will retrieve the most relevant sections from the Thai CCL corpus. You can integrate this model into your application using the Hugging Face Transformers library. |
|
--> |
|