# Llama-3.1-Legal-ThaiCCL-8B

Llama-3.1-Legal-ThaiCCL-8B is a large language model built on Llama-3.1-8B and designed to answer Thai legal questions. It was fully fine-tuned on the WangchanX Thai Legal dataset using the WangchanX Finetuning Pipeline. The model is intended to be used with a supporting Retrieval-Augmented Generation (RAG) system that retrieves relevant legal documents for the model to reference when answering questions.
## Model description
- Base model: Meta Llama 3.1 8B
- Training Repository: WangchanX Finetuning Pipeline
- Training Dataset: WangchanX Thai Legal dataset
- License: Meta's Llama 3.1 Community License Agreement
## Model Usage
```python
import torch
import transformers

EN_QA_TEMPLATE = "Given the user's query in the context of Thai legal matters, the RAG system retrieves the top_n related documents. From these documents, it's crucial to identify and utilize only the most relevant ones to craft an accurate and informative response.Context information is below.\n\n---------------------\nContext: Thai legal domain\nQuery: {query_str}\nRetrieved Documents: {context_str}\n---------------------\n\n Using the provided context information and the list of retrieved documents, you will focus on selecting the documents that are most relevant to the user's query. This selection process involves evaluating the content of each document for its pertinency to the query, ensuring that the response is based on accurate and contextually appropriate information.Based on the selected documents, you will synthesize a response that addresses the user's query, drawing directly from the content of these documents to provide a precise, legally informed answer.You must answer in Thai.\nAnswer:"

EN_SYSTEM_PROMPT_STR = """You are a legal assistant named Sommai (สมหมาย in Thai). You provide legal advice in a friendly, clear, and approachable manner. When answering questions, you reference the relevant law sections, including the name of the act or code they are from. You explain what these sections entail, including any associated punishments, fees, or obligations. Your tone is polite yet informal, making users feel comfortable, like consulting a trusted friend. If a question falls outside your knowledge, you must respond with the exact phrase: 'สมหมายไม่สามารถตอบคำถามนี้ได้ครับ'. You avoid making up information and guide users based on accurate legal references relevant to their situation. Where applicable, you provide practical advice, such as preparing documents, seeking medical attention, or contacting authorities. If asked about past Supreme Court judgments, you must state that you do not have information on those judgments at this time."""

query = "การร้องขอให้ศาลสั่งให้บุคคลเป็นคนไร้ความสามารถมีหลักเกณฑ์การพิจารณาอย่างไร"
context = """ประมวลกฎหมายแพ่งและพาณิชย์ มาตรา 33 ในคดีที่มีการร้องขอให้ศาลสั่งให้บุคคลใดเป็นคนไร้ความสามารถเพราะวิกลจริต ถ้าทางพิจารณาได้ความว่าบุคคลนั้นไม่วิกลจริต แต่มีจิตฟั่นเฟือนไม่สมประกอบ เมื่อศาลเห็นสมควรหรือเมื่อมีคำขอของคู่ความหรือของบุคคลตามที่ระบุไว้ในมาตรา 28 ศาลอาจสั่งให้บุคคลนั้นเป็นคนเสมือนไร้ความสามารถก็ได้ หรือในคดีที่มีการร้องขอให้ศาลสั่งให้บุคคลใดเป็นคนเสมือนไร้ความสามารถเพราะมีจิตฟั่นเฟือนไม่สมประกอบ ถ้าทางพิจารณาได้ความว่าบุคคลนั้นวิกลจริต เมื่อมีคำขอของคู่ความหรือของบุคคลตามที่ระบุไว้ในมาตรา 28 ศาลอาจสั่งให้บุคคลนั้นเป็นคนไร้ความสามารถก็ได้"""

model_id = "airesearch/LLaMa3.1-8B-Legal-ThaiCCL-Combine"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

sample = [
    {"role": "system", "content": EN_SYSTEM_PROMPT_STR},
    {"role": "user", "content": EN_QA_TEMPLATE.format(context_str=context, query_str=query)},
]

prompt = pipeline.tokenizer.apply_chat_template(
    sample, tokenize=False, add_generation_prompt=True
)

# Stop generation at either the end-of-sequence token or Llama 3.1's end-of-turn token
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])  # print only the newly generated answer
```
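Since the model expects retrieved passages in `{context_str}`, a retrieval step has to run before the pipeline call. Below is a minimal sketch of such a step using `sentence-transformers`; the embedding model (`BAAI/bge-m3`) and the `top_n` value are illustrative assumptions, not the exact RAG configuration used with this model.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed embedding model for illustration; swap in your own retriever
embedder = SentenceTransformer("BAAI/bge-m3")

def retrieve(query: str, documents: list[str], top_n: int = 5) -> str:
    """Return the top_n documents most similar to the query, joined for {context_str}."""
    query_emb = embedder.encode(query, convert_to_tensor=True)
    doc_embs = embedder.encode(documents, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, doc_embs)[0]
    top_idx = scores.topk(k=min(top_n, len(documents))).indices
    return "\n".join(documents[i] for i in top_idx)

# context = retrieve(query, legal_corpus)  # then format EN_QA_TEMPLATE as above
```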
## Training Data

The model was trained on the WangchanX Legal ThaiCCL RAG dataset, a Thai legal question-answering dataset in which a RAG system retrieves relevant supporting legal documents for the LLM to reference in its answer. For more information on how the dataset was created, please refer to this blog.

To emulate a real-world use case, we incorporated both the positive and the negative contexts (when available) into the training prompt. We found that this produces a model that is more robust to cases where the RAG system passes in irrelevant contexts mixed with the correct context (refer to the evaluation section for results); a sketch of this mixing step follows below.
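As a rough illustration of that training-time setup, the sketch below mixes positive and negative contexts into a single prompt. The `hard_negative_contexts` field name is hypothetical; check the actual dataset schema before using it. `EN_QA_TEMPLATE` is the template defined in the Model Usage section.

```python
import random

def format_with_negatives(example, max_contexts=5, seed=42):
    # Positive contexts, plus negatives under a hypothetical column name
    contexts = [c["text"] for c in example["positive_contexts"][:max_contexts]]
    negatives = example.get("hard_negative_contexts") or []  # hypothetical field
    contexts += [c["text"] for c in negatives[:max_contexts]]
    random.Random(seed).shuffle(contexts)  # so the relevant passage is not always first
    return EN_QA_TEMPLATE.format(query_str=example["question"], context_str="".join(contexts))
```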
## Prompt Format

We recommend using the same chat template (the system prompt and the question template containing the context, query, and retrieved documents) with the provided weights, since the model was trained with this specific system prompt and question template. Example input prompt:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a legal assistant named Sommai (สมหมาย in Thai), you provide legal advice to users in a friendly and understandable manner. When answering questions, you specifically reference the law sections relevant to the query, including the name of the act or code they originated from, an explanation of what those sections entail, and any associated punishments or fees. Your tone is approachable and informal yet polite, making users feel as if they are seeking advice from a friend. If a question arises that does not match the information you possess, you must acknowledge your current limitations by stating this exactly sentence: 'สมหมายไม่สามารถตอบคำถามนี้ได้ครับ'. You will not fabricate information but rather guide users based on actual law sections relevant to their situation. Additionally, you offer practical advice on next steps, such as gathering required documents, seeking medical attention, or visiting a police station, as applicable. If inquired about past Supreme Court judgments, you must reply that you do not have information on those judgments yet.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Given the user's query in the context of Thai legal matters, the RAG system retrieves the top_n related documents. From these documents, it's crucial to identify and utilize only the most relevant ones to craft an accurate and informative response.
Context information is below.
---------------------
Context: Thai legal domain
Query: {question}
Retrieved Documents: {retrieved legal documents}
---------------------
Using the provided context information and the list of retrieved documents, you will focus on selecting the documents that are most relevant to the user's query. This selection process involves evaluating the content of each document for its pertinency to the query, ensuring that the response is based on accurate and contextually appropriate information.
Based on the selected documents, you will synthesize a response that addresses the user's query, drawing directly from the content of these documents to provide a precise, legally informed answer.
You must answer in Thai.
Answer:
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
```
Here is a Python code snippet showing how to apply the chat template, with the provided system prompt and question template, to the WangchanX Legal Thai CCL dataset:
```python
# EN_QA_TEMPLATE and EN_SYSTEM_PROMPT_STR are the same strings defined in
# the Model Usage section above.

def format_example(example):
    # Strip the "คำตอบ: " ("Answer: ") prefix from the gold answer if present
    if "คำตอบ: " in example["positive_answer"]:
        example["positive_answer"] = example["positive_answer"].replace("คำตอบ: ", "")
    if example["positive_contexts"]:
        # Concatenate up to five positive contexts into a single context string
        context = "".join([v["text"] for v in example["positive_contexts"][:5]])
        message = [
            {"content": EN_SYSTEM_PROMPT_STR, "role": "system"},
            {"content": EN_QA_TEMPLATE.format(query_str=example["question"], context_str=context), "role": "user"},
        ]
    else:
        # No positive contexts available: fall back to an empty context string
        message = [
            {"content": EN_SYSTEM_PROMPT_STR, "role": "system"},
            {"content": EN_QA_TEMPLATE.format(query_str=example["question"], context_str=" "), "role": "user"},
        ]
    return dict(messages=message)

dataset = dataset.map(format_example, batched=False)
```
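To check that the formatted examples match the prompt format shown above, one can render the first example back into a string with the model's tokenizer (a quick sanity check, assuming `dataset` from the previous step):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("airesearch/LLaMa3.1-8B-Legal-ThaiCCL-Combine")
text = tokenizer.apply_chat_template(
    dataset[0]["messages"], tokenize=False, add_generation_prompt=True
)
print(text)  # should match the example prompt shown above
```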
## Training hyperparameters

We fully fine-tuned Llama-3.1-8B with the following hyperparameters:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4
Total training time: 2:15:14.66
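For reference, the total train batch size above is the product of the per-device batch size, the number of devices, and the gradient accumulation steps:

```python
# 4 (per-device batch) x 4 (GPUs) x 16 (gradient accumulation steps) = 256
assert 4 * 4 * 16 == 256
```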
## Evaluation

We evaluated the model on the test set of the WangchanX Legal Thai CCL dataset using both traditional machine reading comprehension (MRC) metrics and an LLM-as-judge technique based on the paper CHIE: Generative MRC Evaluation for in-context QA with Correctness, Helpfulness, Irrelevancy, and Extraneousness Aspects.
Note: LLaMa3.1-8B-Legal-ThaiCCL was trained on only positive contexts, while LLaMa3.1-8B-Legal-ThaiCCL-Combine was trained on both positive and negative contexts.
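For the MRC metrics in Table 1, a minimal scoring sketch with the Hugging Face `evaluate` library might look as follows; the exact evaluation settings (tokenization, BERTScore model, aggregation) are assumptions, not the card's original configuration.

```python
import evaluate

predictions = ["..."]  # model answers on the test set (placeholders)
references = ["..."]   # gold answers (placeholders)

rouge = evaluate.load("rouge")
cer = evaluate.load("cer")
wer = evaluate.load("wer")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references)["rougeL"])
print(cer.compute(predictions=predictions, references=references))
print(wer.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="th")["f1"])
```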
Table 1: MRC Results
| Model | Context Type | Answer Type | ROUGE-L | Character Error Rate (CER) | Word Error Rate (WER) | BERT Score | F1-score XQuAD | Exact Match XQuAD |
|---|---|---|---|---|---|---|---|---|
| Zero-shot LLaMa3.1-8B-Instruct | Golden Passage | Only Positive | 0.553 | 1.181 | 1.301 | 0.769 | 48.788 | 0.0 |
| LLaMa3.1-8B-Legal-ThaiCCL | Golden Passage | Only Positive | 0.603 | 0.667 | 0.736 | 0.821 | 60.039 | 0.053 |
| LLaMa3.1-8B-Legal-ThaiCCL-Combine | Golden Passage | Only Positive | 0.715 | 0.695 | 0.758 | 0.833 | 64.578 | 0.614 |
| Zero-shot LLaMa3.1-70B-Instruct | Golden Passage | Only Positive | 0.830 | 0.768 | 0.848 | 0.830 | 61.497 | 0.0 |
| Zero-shot LLaMa3.1-8B-Instruct | Retrieval Passage | Only Positive | 0.422 | 1.631 | 1.773 | 0.757 | 39.639 | 0.0 |
| LLaMa3.1-8B-Legal-ThaiCCL | Retrieval Passage | Only Positive | 0.366 | 1.078 | 1.220 | 0.779 | 44.238 | 0.03 |
| LLaMa3.1-8B-Legal-ThaiCCL-Combine | Retrieval Passage | Only Positive | 0.516 | 0.884 | 0.884 | 0.816 | 54.948 | 0.668 |
| Zero-shot LLaMa3.1-70B-Instruct | Retrieval Passage | Only Positive | 0.616 | 0.934 | 1.020 | 0.816 | 54.930 | 0.0 |
Table 2: CHIE Results ([H]: higher is better; [L]: lower is better)

| Model | Context Type | Answer Type | Q1: Correctness [H] | Q2: Helpfulness [H] | Q3: Irrelevancy [L] | Q4: Out-of-Context [L] |
|---|---|---|---|---|---|---|
| Zero-shot LLaMa3.1-8B-Instruct | Golden Passage | Only Positive | 0.740 | 0.808 | 0.480 | 0.410 |
| LLaMa3.1-8B-Legal-ThaiCCL | Golden Passage | Only Positive | 0.705 | 0.486 | 0.294 | 0.208 |
| LLaMa3.1-8B-Legal-ThaiCCL-Combine | Golden Passage | Only Positive | 0.565 | 0.468 | 0.405 | 0.325 |
| Zero-shot LLaMa3.1-70B-Instruct | Golden Passage | Only Positive | 0.870 | 0.658 | 0.316 | 0.247 |
| Zero-shot LLaMa3.1-8B-Instruct | Retrieval Passage | Only Positive | 0.480 | 0.822 | 0.557 | 0.248 |
| LLaMa3.1-8B-Legal-ThaiCCL | Retrieval Passage | Only Positive | 0.274 | 0.470 | 0.720 | 0.191 |
| LLaMa3.1-8B-Legal-ThaiCCL-Combine | Retrieval Passage | Only Positive | 0.532 | 0.445 | 0.508 | 0.203 |
| Zero-shot LLaMa3.1-70B-Instruct | Retrieval Passage | Only Positive | 0.748 | 0.594 | 0.364 | 0.202 |
## License and use
The model is released under Meta's Llama 3.1 Community License Agreement. Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc.