llamaRAGdrama / README.md
metadata
license: apache-2.0
model-index:
  - name: llamaRAGdrama
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 72.01
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 88.83
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.5
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 70.24
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 86.66
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 65.66
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kevin009/llamaRAGdrama
          name: Open LLM Leaderboard

The model remains factual and reliable even in dramatic situations.


Model Card for kevin009/llamaRAGdrama

Model Details

  • Model Name: kevin009/llamaRAGdrama
  • Model Type: Fine-tuned for Q&A and RAG.
  • Fine-tuning Objective: Synthesize text content in Q&A and RAG scenarios.

Intended Use

  • Applications: RAG, Q&A
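For the RAG application, retrieved passages have to be combined with the user's question into a single prompt. The model card does not document a prompt template, so the sketch below is only one plausible layout: `build_rag_prompt`, the instruction wording, and the placeholder passages are all assumptions made for illustration.

```python
# Hypothetical helper: the model card does not specify a prompt template,
# so this layout is an assumption for illustration only.
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Join retrieved passages and a question into one prompt string."""
    # Number each passage so the answer can refer back to its source.
    context = "\n".join(
        f"[{i}] {p.strip()}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder inputs; in practice the passages come from your retriever.
prompt = build_rag_prompt(
    "What does the first passage say?",
    ["Placeholder text of the first retrieved passage.",
     "Placeholder text of the second retrieved passage."],
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a plain prompt.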

Training Data

  • Sources: Includes a diverse dataset of dramatic texts, enriched with factual databases and reliable sources to train the model on generating content that remains true to real-world facts.
  • Preprocessing: In addition to removing non-content text, data was annotated to distinguish between purely creative elements and those that require factual accuracy, ensuring a balanced training approach.

How to Use

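from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("kevin009/llamaRAGdrama")
model = AutoModelForCausalLM.from_pretrained("kevin009/llamaRAGdrama")

input_text = "Enter your prompt here"
# Calling the tokenizer returns input_ids together with an attention mask.
inputs = tokenizer(input_text, return_tensors="pt")
# do_sample=True is needed for temperature to take effect; without it,
# generate() decodes greedily. max_length counts prompt plus new tokens.
output_tokens = model.generate(
    **inputs,
    max_length=100,
    num_return_sequences=1,
    do_sample=True,
    temperature=0.9,
)
generated_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print(generated_text)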

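Replace "Enter your prompt here" with your starting text. Adjust temperature to control how varied the output is: lower values make generation more deterministic, higher values more creative. Temperature only takes effect when sampling is enabled (do_sample=True).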
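To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax, the usual definition of sampling temperature. The logits are made-up numbers and `softmax_with_temperature` is an illustrative helper, not a function from transformers.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.0]
for t in (0.5, 0.9, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it across the alternatives.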

Limitations and Biases

  • Content Limitation: While designed to be truthful, the model's output should not be assumed to be safe.
  • Biases: The model may still exhibit biases and produce inaccurate output.

Licensing and Attribution

  • License: Apache-2.0

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric                            | Value |
|-----------------------------------|-------|
| Avg.                              | 74.65 |
| AI2 Reasoning Challenge (25-Shot) | 72.01 |
| HellaSwag (10-Shot)               | 88.83 |
| MMLU (5-Shot)                     | 64.50 |
| TruthfulQA (0-shot)               | 70.24 |
| Winogrande (5-shot)               | 86.66 |
| GSM8k (5-shot)                    | 65.66 |
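The reported average is consistent with an unweighted mean of the six benchmark scores. A short check, assuming equal weighting (the card itself does not state how the average is computed):

```python
# Benchmark scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.01,
    "HellaSwag (10-Shot)": 88.83,
    "MMLU (5-Shot)": 64.50,
    "TruthfulQA (0-shot)": 70.24,
    "Winogrande (5-shot)": 86.66,
    "GSM8k (5-shot)": 65.66,
}

# Unweighted mean across the six benchmarks (equal weighting is an assumption).
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```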