Muhammad2003/Llama-3-8B-DPO-1500

@@ -3,232 +3,4 @@ language:
 - en
 license: llama3
 library_name: transformers
-tags:
-- axolotl
-- finetune
-- dpo
-- facebook
-- meta
-- pytorch
-- llama
-- llama-3
-base_model: meta-llama/Meta-Llama-3-8B-Instruct
-datasets:
-- Intel/orca_dpo_pairs
-model_name: Llama-3-8B-Instruct-DPO-v0.3
-pipeline_tag: text-generation
-license_name: llama3
-license_link: LICENSE
-inference: false
-model_creator: MaziyarPanahi
-quantized_by: MaziyarPanahi
-model-index:
-- name: Llama-3-8B-Instruct-DPO-v0.3
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: AI2 Reasoning Challenge (25-Shot)
-      type: ai2_arc
-      config: ARC-Challenge
-      split: test
-      args:
-        num_few_shot: 25
-    metrics:
-    - type: acc_norm
-      value: 62.63
-      name: normalized accuracy
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: HellaSwag (10-Shot)
-      type: hellaswag
-      split: validation
-      args:
-        num_few_shot: 10
-    metrics:
-    - type: acc_norm
-      value: 79.2
-      name: normalized accuracy
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU (5-Shot)
-      type: cais/mmlu
-      config: all
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 68.33
-      name: accuracy
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: TruthfulQA (0-shot)
-      type: truthful_qa
-      config: multiple_choice
-      split: validation
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: mc2
-      value: 53.29
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: Winogrande (5-shot)
-      type: winogrande
-      config: winogrande_xl
-      split: validation
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 75.37
-      name: accuracy
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: GSM8k (5-shot)
-      type: gsm8k
-      config: main
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 70.58
-      name: accuracy
-    source:
-      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
-      name: Open LLM Leaderboard
----
-<img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
-# Llama-3-8B-Instruct-DPO-v0.3 (32k)
-This model is a fine-tune (DPO) of `meta-llama/Meta-Llama-3-8B-Instruct` model. I have used `rope_theta` to extend the context length up to 32K safely.
-# Quantized GGUF
-All GGUF models come with context length of `32000`: [Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF)
-# Prompt Template
-This model uses `ChatML` prompt template:
-```
-<|im_start|>system
-{System}
-<|im_end|>
-<|im_start|>user
-{User}
-<|im_end|>
-<|im_start|>assistant
-{Assistant}
-````
-# How to use
-You can use this model by using `MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3` as the model name in Hugging Face's
-transformers library.
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
-from transformers import pipeline
-import torch
-model_id = "MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3"
-model = AutoModelForCausalLM.from_pretrained(
-    model_id,
-    torch_dtype=torch.bfloat16,
-    device_map="auto",
-    trust_remote_code=True,
-    # attn_implementation="flash_attention_2"
-)
-tokenizer = AutoTokenizer.from_pretrained(
-    model_id,
-    trust_remote_code=True
-)
-streamer = TextStreamer(tokenizer)
-pipeline = pipeline(
-    "text-generation",
-    model=model,
-    tokenizer=tokenizer,
-    model_kwargs={"torch_dtype": torch.bfloat16},
-    streamer=streamer
-)
-# Then you can use the pipeline to generate text.
-messages = [
-    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
-    {"role": "user", "content": "Who are you?"},
-]
-prompt = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    add_generation_prompt=True
-)
-terminators = [
-    tokenizer.eos_token_id,
-    tokenizer.convert_tokens_to_ids("<|im_end|>")
-]
-outputs = pipeline(
-    prompt,
-    max_new_tokens=8192,
-    eos_token_id=terminators,
-    do_sample=True,
-    temperature=0.6,
-    top_p=0.95,
-)
-print(outputs[0]["generated_text"][len(prompt):])
-```
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__Llama-3-8B-Instruct-DPO-v0.3)
-|             Metric              |Value|
-|---------------------------------|----:|
-|Avg.                             |68.23|
-|AI2 Reasoning Challenge (25-Shot)|62.63|
-|HellaSwag (10-Shot)              |79.20|
-|MMLU (5-Shot)                    |68.33|
-|TruthfulQA (0-shot)              |53.29|
-|Winogrande (5-shot)              |75.37|
-|GSM8k (5-shot)                   |70.58|

+---