---
language:
- th
- en
license: apache-2.0
library_name: transformers
tags:
- openthaigpt
- llama
datasets:
- kobkrit/rd-taxqa
- iapp_wiki_qa_squad
- Thaweewat/alpaca-cleaned-52k-th
- Thaweewat/instruction-wild-52k-th
- Thaweewat/databricks-dolly-15k-th
- Thaweewat/hc3-24k-th
- Thaweewat/gpteacher-20k-th
- Thaweewat/onet-m6-social
- Thaweewat/alpaca-finance-43k-th
pipeline_tag: text-generation
model-index:
- name: openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 44.97
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 70.19
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 36.22
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 49.99
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 69.38
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 1.36
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
      name: Open LLM Leaderboard
---

# 🇹🇭 OpenThaiGPT 1.0.0-beta 🇹🇭

OpenThaiGPT Version 1.0.0-beta is a Thai-language, 7B-parameter LLaMA v2 Chat model finetuned to follow Thai-translated instructions, with more than 24,500 of the most common Thai words added to the model's vocabulary for faster generation.

## Upgrade from OpenThaiGPT 1.0.0-alpha
- Added more than 24,500 of the most common Thai words to the model's vocabulary and re-pretrained the embedding layers, which makes Thai text generation about 10 times faster than the previous version.

## Support
- Official website: https://openthaigpt.aieat.or.th
- Facebook page: https://web.facebook.com/groups/openthaigpt
- A Discord server for discussion and support [here](https://discord.gg/rUTp6dfVUF)
- E-mail: kobkrit@iapp.co.th

## License
**Source Code**: Apache Software License 2.0.
**Weights**: Research and **commercial** use.
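The vocabulary extension described above speeds up generation because a common Thai word becomes a single token instead of a run of byte-level fallback tokens, so the model needs far fewer decoding steps per sentence. A toy sketch of that effect (not the real tokenizer; the word boundaries and counts here are only illustrative):

```python
# Toy illustration: byte-level fallback vs. whole-word vocabulary entries.
text = "ภาษาไทย" * 10  # "Thai language" repeated 10 times (7 chars each)

# Without Thai words in the vocabulary, tokenization falls back to
# something close to one token per UTF-8 byte (3 bytes per Thai char).
byte_tokens = list(text.encode("utf-8"))

# With "ภาษาไทย" added as a single vocabulary entry, the same text
# tokenizes into one token per word occurrence.
word_tokens = [text[i:i + 7] for i in range(0, len(text), 7)]

print(len(byte_tokens), len(word_tokens))  # 210 vs. 10 generation steps
```

Since each generated token costs one forward pass, shrinking the token count per Thai sentence directly shortens generation time, which is the source of the reported speedup.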
## Code and Weight
**Colab Demo**: https://colab.research.google.com/drive/1kDQidCtY9lDpk49i7P3JjLAcJM04lawu?usp=sharing
**Finetune Code**: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta
**Inference Code**: https://github.com/OpenThaiGPT/openthaigpt
**Weight (Huggingface Checkpoint)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf

## Sponsors
Pantip.com, ThaiSC
### Powered by
OpenThaiGPT Volunteers, Artificial Intelligence Entrepreneur Association of Thailand (AIEAT), and Artificial Intelligence Association of Thailand (AIAT)

### Authors
* Kobkrit Viriyayudhakorn (kobkrit@aieat.or.th)
* Sumeth Yuenyong (sumeth.yue@mahidol.edu)
* Thaweewat Rugsujarit (thaweewr@scg.com)
* Jillaphat Jaroenkantasima (autsadang41@gmail.com)
* Norapat Buppodom (new@norapat.com)
* Koravich Sangkaew (kwankoravich@gmail.com)
* Peerawat Rojratchadakorn (peerawat.roj@gmail.com)
* Surapon Nonesung (nonesungsurapon@gmail.com)
* Chanon Utupon (chanon.utupon@gmail.com)
* Sadhis Wongprayoon (sadhis.tae@gmail.com)
* Nucharee Thongthungwong (nuchhub@hotmail.com)
* Chawakorn Phiantham (mondcha1507@gmail.com)
* Patteera Triamamornwooth (patt.patteera@gmail.com)
* Nattarika Juntarapaoraya (natt.juntara@gmail.com)
* Kriangkrai Saetan (kraitan.ss21@gmail.com)
* Pitikorn Khlaisamniang (pitikorn32@gmail.com)

Disclaimer: Provided responses are not guaranteed.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_openthaigpt__openthaigpt-1.0.0-beta-7b-chat-ckpt-hf)

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |45.35|
|AI2 Reasoning Challenge (25-Shot)|44.97|
|HellaSwag (10-Shot)              |70.19|
|MMLU (5-Shot)                    |36.22|
|TruthfulQA (0-shot)              |49.99|
|Winogrande (5-shot)              |69.38|
|GSM8k (5-shot)                   | 1.36|
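The Huggingface checkpoint linked above can be loaded with 🤗 Transformers. A minimal inference sketch follows; note that this card does not document a chat prompt template, so the plain Thai prompt below is an assumption for illustration only:

```python
# Minimal inference sketch for the checkpoint on this card.
# Requires: transformers, torch (and accelerate for device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "สวัสดีครับ"  # "Hello" in Thai; real chat formatting may differ
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For the ready-made environment, see the Colab Demo linked in the Code and Weight section.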