---
datasets:
- airesearch/WangchanX-Legal-ThaiCCL-RAG
language:
- th
pipeline_tag: sentence-similarity
tags:
- legal
- RAG
widget: []
license: mit
base_model:
- BAAI/bge-m3
---
## WangchanX-Legal-ThaiCCL-Retriever
This model card describes WangchanX-Legal-ThaiCCL-Retriever, a retriever model fine-tuned from the bge-m3 model on the WangchanX-Legal-ThaiCCL-RAG dataset. It is designed to retrieve relevant legal text sections in response to legal questions posed in Thai, specifically focusing on Corporate and Commercial Law (CCL).
**Model Details:**
* **Base Model:** [bge-m3](https://huggingface.co/BAAI/bge-m3)
* **Fine-tuned Dataset:** [WangchanX-Legal-ThaiCCL-RAG dataset](https://huggingface.co/datasets/airesearch/WangchanX-Legal-ThaiCCL-RAG)
* **Language:** Thai
* **Maximum Sequence Length:** 8192 tokens
* **Output Dimensionality:** 1024 dimensions
* **License:** MIT
**WangchanX-Legal-ThaiCCL-RAG**
This dataset supports Thai legal question-answering systems built on Retrieval-Augmented Generation (RAG), with a focus on Corporate and Commercial Law.
**Intended Use Cases:**
This model is designed for use as a retriever model within a larger RAG pipeline.
* **Legal Question Answering:** Serving as the retrieval component of a question-answering system that answers user queries about Thai law.
* **Legal Information Retrieval:** Enabling efficient retrieval of information from Thai legal texts.
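**Example (illustrative sketch):**
The retrieval step described above can be sketched as follows. This is a minimal example and not an official usage recipe. It assumes the checkpoint is published under the repo id `airesearch/WangchanX-Legal-ThaiCCL-Retriever` and that it loads with the `sentence-transformers` library, as the base bge-m3 model does. Neither is confirmed by this card. The helper names `cosine`, `top_k_sections`, and `retrieve` are hypothetical.

```python
import math

# Assumption: repo id of this model on the Hugging Face Hub.
MODEL_ID = "airesearch/WangchanX-Legal-ThaiCCL-Retriever"


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k_sections(query_vec, section_vecs, k=3):
    """Return the k best (section_index, score) pairs, highest score first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(section_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def retrieve(question, sections, k=3):
    """Embed a Thai legal question and candidate sections, then rank the sections."""
    # Imported lazily so the ranking helpers work without this dependency.
    # Assumption: the checkpoint is loadable via sentence-transformers.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    vecs = model.encode([question] + list(sections)).tolist()
    ranked = top_k_sections(vecs[0], vecs[1:], k)
    return [(sections[i], score) for i, score in ranked]
```

Calling `retrieve(question, sections)` with a Thai question and a list of legal section texts returns the highest-scoring sections, which a RAG pipeline would then pass to its generator. For a large corpus, section embeddings would normally be computed once and stored in a vector index instead of being re-encoded per query.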
<!-- ## Usage
This model is designed for use as a retriever model within a larger RAG pipeline. Given a legal question in Thai, it will retrieve the most relevant sections from the Thai CCL corpus. You can integrate this model into your application using the Hugging Face Transformers library.
-->
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
<!--
## Citation
### BibTeX
-->