---
language:
  - th
  - en
license: apache-2.0
library_name: transformers
tags:
  - openthaigpt
  - llama
datasets:
  - kobkrit/rd-taxqa
  - iapp_wiki_qa_squad
  - Thaweewat/alpaca-cleaned-52k-th
  - Thaweewat/instruction-wild-52k-th
  - Thaweewat/databricks-dolly-15k-th
  - Thaweewat/hc3-24k-th
  - Thaweewat/gpteacher-20k-th
  - Thaweewat/onet-m6-social
  - Thaweewat/alpaca-finance-43k-th
pipeline_tag: text-generation
model-index:
  - name: openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 44.97
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 70.19
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 36.22
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 49.99
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 69.38
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 1.36
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
          name: Open LLM Leaderboard
---
# 🇹🇭 OpenThaiGPT 1.0.0-beta

🇹🇭 OpenThaiGPT Version 1.0.0-beta is a Thai-language 7B-parameter LLaMA v2 Chat model, fine-tuned to follow Thai-translated instructions. Its tokenizer vocabulary has been extended with more than 24,500 of the most common Thai words, which substantially speeds up Thai text generation.
## Upgrade from OpenThaiGPT 1.0.0-alpha

- Added more than 24,500 of the most common Thai words to the tokenizer vocabulary and re-pretrained the embedding layers, making Thai text generation about 10 times faster than the previous version. (A quick way to verify this is sketched below.)
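A simple way to see the effect of the extended vocabulary is to tokenize a Thai sentence with the checkpoint's tokenizer. The following is a minimal sketch using the Hugging Face transformers library; the sample sentence is an arbitrary illustration, not from the training data:

```python
from transformers import AutoTokenizer

# Load the extended tokenizer shipped with this checkpoint.
tokenizer = AutoTokenizer.from_pretrained(
    "openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf"
)

# An arbitrary Thai sample sentence ("Hello, the weather is very nice today.").
text = "สวัสดีครับ วันนี้อากาศดีมาก"
tokens = tokenizer.tokenize(text)
print(len(tokens), tokens)

# Common Thai words should now map to single tokens rather than the
# byte-level fallback pieces produced by the original LLaMA v2 tokenizer,
# so generating a Thai sentence needs far fewer decoding steps.
```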
## Support
- Official website: https://openthaigpt.aieat.or.th
- Facebook page: https://web.facebook.com/groups/openthaigpt
- A Discord server for discussion and support here
- E-mail: [email protected]
## License

- **Source Code:** Apache Software License 2.0.
- **Weights:** Available for both research and commercial use.
## Code and Weights

- Colab Demo: https://colab.research.google.com/drive/1kDQidCtY9lDpk49i7P3JjLAcJM04lawu?usp=sharing
- Finetune Code: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta
- Inference Code: https://github.com/OpenThaiGPT/openthaigpt
- Weights (Hugging Face Checkpoint): https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
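For reference, here is a minimal inference sketch with transformers. The bare-question prompt is an assumption for illustration; consult the Colab demo and inference repository above for the exact prompt template the model was trained with:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fits a 7B model on a ~16 GB GPU; use float32 on CPU
    device_map="auto",
)

# NOTE: a plain question is used here for illustration only; the official
# inference code linked above defines the actual instruction format.
prompt = "ประเทศไทยมีกี่จังหวัด"  # "How many provinces does Thailand have?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```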
## Sponsors

Pantip.com, ThaiSC
## Powered by

OpenThaiGPT Volunteers, Artificial Intelligence Entrepreneur Association of Thailand (AIEAT), and Artificial Intelligence Association of Thailand (AIAT)
## Authors
- Kobkrit Viriyayudhakorn ([email protected])
- Sumeth Yuenyong ([email protected])
- Thaweewat Rugsujarit ([email protected])
- Jillaphat Jaroenkantasima ([email protected])
- Norapat Buppodom ([email protected])
- Koravich Sangkaew ([email protected])
- Peerawat Rojratchadakorn ([email protected])
- Surapon Nonesung ([email protected])
- Chanon Utupon ([email protected])
- Sadhis Wongprayoon ([email protected])
- Nucharee Thongthungwong ([email protected])
- Chawakorn Phiantham ([email protected])
- Patteera Triamamornwooth ([email protected])
- Nattarika Juntarapaoraya ([email protected])
- Kriangkrai Saetan ([email protected])
- Pitikorn Khlaisamniang ([email protected])
*Disclaimer: The accuracy of the model's responses is not guaranteed.*
## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 45.35 |
| AI2 Reasoning Challenge (25-Shot) | 44.97 |
| HellaSwag (10-Shot)               | 70.19 |
| MMLU (5-Shot)                     | 36.22 |
| TruthfulQA (0-shot)               | 49.99 |
| Winogrande (5-shot)               | 69.38 |
| GSM8k (5-shot)                    |  1.36 |