WestLake-7B-v2 / README.md
leaderboard-pr-bot's picture
Adding Evaluation Results
c2fe808 verified
|
raw
history blame
7.61 kB
metadata
language:
  - en
license: apache-2.0
library_name: transformers
model-index:
  - name: WestLake-7B-v2
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 73.04
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 88.65
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.71
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 67.06
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 86.98
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 67.63
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=senseable/WestLake-7B-v2
          name: Open LLM Leaderboard

image/png

Update Notes: Version 2 trained 1 additional epoch cycle for 3 total

Westlake-7Bv2: Role-Play & Text Generation Specialist Model

Welcome to the documentation of Westlake-7B, a cutting-edge language model designed for exceptional role-play and text generation tasks. This README file aims to provide an overview of our capabilities, usage guidelines, and potential applications.

About Westlake-7Bv2

Westlake-7B is built upon a vast corpus of diverse texts, enabling it to generate contextually relevant responses in various scenarios. With its impressive size of 7 billion parameters, this model excels at understanding nuances in language and producing creative outputs.

Key Features

  1. Role-Play: Westlake-7Bv2 can seamlessly adapt to different character personas and engage in dynamic conversations while maintaining consistency throughout the interaction. It can generate believable dialogues across various genres, including fiction, non-fiction, historical events, or even fantasy worlds.
  2. Text Generation: This model is proficient at generating original content such as stories, poems, essays, news articles, and more. Its ability to capture the essence of different writing styles makes it an ideal tool for creative writers seeking inspiration or assistance in their projects.
  3. Contextual Understanding: Westlake-7B's extensive training allows it to comprehend complex contexts and generate responses that align with given situations. It can handle multiple topics simultaneously, making it versatile across various applications.
  4. Continuous Learning: As a language model, Westlake-7B continuously improves its performance through ongoing training on new data sets. This ensures its capabilities remain up-to-date and relevant in an ever-evolving world of communication.

Usage Guidelines

To utilize Westlake-7Bv2 for your projects or experiments, follow these steps:

  1. Prompting: Provide clear and concise prompts that outline the desired role-play scenario or text generation task. The quality of output depends heavily on the clarity and relevance of input instructions.
  2. Feedback Loop: For optimal results, consider incorporating a feedback loop into your application to refine generated outputs based on user preferences or additional contextual information. This iterative process can significantly enhance the model's performance in specific domains.
  3. Ethical Considerations: As with any AI system, ensure responsible usage of Westlake-7B by avoiding harmful content generation or misuse of its capabilities.

Potential Applications

Westlake-7Bv2's versatility makes it suitable for various applications across different industries:

  1. Creative Writing: Assist authors in generating new ideas, expanding storylines, or even completing drafts by providing creative suggestions and textual content.
  2. Education: Enhance language learning platforms with interactive role-play scenarios to improve students' communication skills and cultural understanding.
  3. Gaming: Integrate Westlake-7B into game engines for dynamic non-player character interactions or generating unique questlines based on player choices.
  4. Customer Support: Leverage the model's conversational abilities to create chatbots capable of handling complex queries and providing personalized assistance.
  5. Social Media: Develop applications that generate engaging content such as captions, status updates, or even entire posts tailored to users' preferences and interests.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 74.68
AI2 Reasoning Challenge (25-Shot) 73.04
HellaSwag (10-Shot) 88.65
MMLU (5-Shot) 64.71
TruthfulQA (0-shot) 67.06
Winogrande (5-shot) 86.98
GSM8k (5-shot) 67.63

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 74.68
AI2 Reasoning Challenge (25-Shot) 73.04
HellaSwag (10-Shot) 88.65
MMLU (5-Shot) 64.71
TruthfulQA (0-shot) 67.06
Winogrande (5-shot) 86.98
GSM8k (5-shot) 67.63