metadata
language:
  - th
  - en
license: apache-2.0
library_name: transformers
tags:
  - openthaigpt
  - llama
pipeline_tag: text-generation
model-index:
  - name: openthaigpt-1.0.0-beta-13b-chat-hf
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 53.58
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 79.09
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 51.13
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 44.16
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 73.88
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 0.83
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
          name: Open LLM Leaderboard

🇹🇭 OpenThaiGPT 13b 1.0.0-beta Chat, with 16-bit weights in Hugging Face format.

🇹🇭 OpenThaiGPT 13b Version 1.0.0-beta is a 13B-parameter Thai-language LLaMA v2 Chat model, finetuned to follow Thai-translated instructions. Its vocabulary is extended with more than 10,000 of the most common Thai words to speed up processing of Thai text.

Licenses

Source Code: Apache Software License 2.0.
Weights: Research and commercial use.

Code and Weights

Finetune Code: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta
Inference Code: https://github.com/OpenThaiGPT/openthaigpt
Weights (Hugging Face checkpoint): https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf

Description

The prompt format follows Llama 2:

<s>[INST] <<SYS>>
system_prompt
<</SYS>>

question [/INST]

System prompt: You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
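
The template above can be filled in programmatically. The following is a minimal Python sketch, not an official OpenThaiGPT helper: the names `build_prompt` and `DEFAULT_SYSTEM_PROMPT` are hypothetical and introduced only for illustration, while the template and the system prompt text are copied from this section. It covers a single-turn question only.

```python
# Hypothetical helper: these names are illustrative and not part of OpenThaiGPT.
DEFAULT_SYSTEM_PROMPT = (
    "You are a question answering assistant. "
    "Answer the question as truthful and helpful as possible "
    "คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด"
)


def build_prompt(question: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Wrap a single-turn question in the Llama 2 chat template shown above."""
    return f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{question} [/INST]"


if __name__ == "__main__":
    # Example question: "How can I lose weight?"
    print(build_prompt("อยากลดความอ้วนต้องทำอย่างไร"))
```

Keeping the system prompt as a default argument makes it easy to reuse the documented prompt while still allowing a different one to be passed in.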

How to use

  1. Install vLLM (https://github.com/vllm-project/vllm).
  2. Start the API server, where num_gpus is the number of GPUs to shard the model across:
     python -m vllm.entrypoints.api_server --model /path/to/model --tensor-parallel-size num_gpus
  3. Run inference (curl example):
curl --request POST \
    --url http://localhost:8000/generate \
    --header "Content-Type: application/json" \
    --data '{"prompt": "<s>[INST] <<SYS>>\nYou are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด\n<</SYS>>\n\nอยากลดความอ้วนต้องทำอย่างไร [/INST]", "use_beam_search": false, "temperature": 0.1, "max_tokens": 512, "top_p": 0.75, "top_k": 40, "frequency_penalty": 0.3, "stop": "</s>"}'
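
The same request can be sent from Python. This is a hedged sketch using only the standard library: `build_payload`, `generate`, and `API_URL` are hypothetical names, the URL assumes the server from step 2 is listening on localhost port 8000 as in the curl request above, and the sampling parameters mirror that request. Whether every parameter is accepted depends on the installed vLLM version.

```python
import json
import urllib.request

# Assumption: the API server from step 2 is reachable at this address.
API_URL = "http://localhost:8000/generate"


def build_payload(prompt: str) -> dict:
    """Return a request body with the same sampling parameters as the curl example."""
    return {
        "prompt": prompt,
        "use_beam_search": False,
        "temperature": 0.1,
        "max_tokens": 512,
        "top_p": 0.75,
        "top_k": 40,
        "frequency_penalty": 0.3,
        "stop": "</s>",
    }


def generate(prompt: str, url: str = API_URL) -> dict:
    """POST the prompt to the server and return the decoded JSON response."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `generate(prompt)` with a prompt already wrapped in the Llama 2 format. The shape of the returned JSON is not documented here, so inspect the response rather than assuming a particular key.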

Disclaimer: The accuracy of the model's responses is not guaranteed.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 50.45 |
| AI2 Reasoning Challenge (25-Shot) | 53.58 |
| HellaSwag (10-Shot) | 79.09 |
| MMLU (5-Shot) | 51.13 |
| TruthfulQA (0-shot) | 44.16 |
| Winogrande (5-shot) | 73.88 |
| GSM8k (5-shot) | 0.83 |
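
As a sanity check, the reported average can be reproduced from the six benchmark scores. The sketch below assumes the average is the unweighted mean rounded half-up to two decimals; that rounding rule is an assumption, not something stated by the leaderboard. `Decimal` is used so the rounding is not affected by binary floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": Decimal("53.58"),
    "HellaSwag (10-Shot)": Decimal("79.09"),
    "MMLU (5-Shot)": Decimal("51.13"),
    "TruthfulQA (0-shot)": Decimal("44.16"),
    "Winogrande (5-shot)": Decimal("73.88"),
    "GSM8k (5-shot)": Decimal("0.83"),
}

# Assumption: unweighted mean, rounded half-up to two decimal places.
total = sum(scores.values(), Decimal(0))
average = (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(average)  # 50.45
```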