mav23 committed on
Commit
380ea1d
1 Parent(s): b9956af

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ l3.1-niitorm-8b-dpo-t0.0001.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,226 @@
---
library_name: transformers
tags:
- merge
- llama
- dpo
base_model:
- akjindal53244/Llama-3.1-Storm-8B
- Sao10K/L3.1-8B-Niitama-v1.1
- v000000/L3.1-Niitorm-8B-t0.0001
datasets:
- jondurbin/gutenberg-dpo-v0.1
model-index:
- name: L3.1-Niitorm-8B-DPO-t0.0001
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 76.89
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 30.51
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 14.88
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.93
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.26
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 31.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=v000000/L3.1-Niitorm-8B-DPO-t0.0001
      name: Open LLM Leaderboard
---

# Llama-3.1-Niitorm-8B-DPO

* *DPO Trained, Llama3.1-8B.*

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/QeNjtwolNpxUmpo9NL7VI.png)

<b>New: DPO'd Gutenberg version (full-epoch training).</b>

An RP model with Niitama 1.1 as the base, nearswapped with "Storm", one of the smartest Llama 3.1 models, then DPO'd; mostly abliterated.

Essentially, it's an improved Niitama 1.1.

-------------------------------------------------------------------------------

*Gutenberg DPO produces more human-like prose/story writing and greatly lessens the synthetic feel of outputs.*

-------------------------------------------------------------------------------

# *llama.cpp:*

# thank you, mradermacher (GGUF)

* [GGUF static](https://huggingface.co/mradermacher/L3.1-Niitorm-8B-DPO-t0.0001-GGUF)

* [GGUF Imatrix](https://huggingface.co/mradermacher/L3.1-Niitorm-8B-DPO-t0.0001-i1-GGUF)

# thank you, QuantFactory (GGUF)

* [GGUF static](https://huggingface.co/QuantFactory/L3.1-Niitorm-8B-DPO-t0.0001-GGUF)

# v0 (GGUF)

* [GGUF Imatrix](https://huggingface.co/v000000/L3.1-Niitorm-8B-DPO-t0.0001-GGUFs-IMATRIX) *- Q8, Q6_K, Q5_K_S, Q4_K_S, and IQ4_XS only*

## Finetune and merge

This is a merge and finetune of pre-trained language models.

*The resulting merge was finetuned* on [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1) for 1 epoch at a 1.5e-5 learning rate, on an Nvidia A100.

## Merge Details
### Merge Method

This model was merged using the <b>NEARSWAP t0.0001</b> merge algorithm.
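For intuition, the nearswap rule can be sketched in a few lines of plain Python. This is an illustrative toy based on the algorithm's public description (weights that already agree between the two models are swapped toward the secondary model, with interpolation strength capped by `t`); `nearswap` here is a hypothetical helper, not the actual mergekit implementation.

```python
# Hypothetical sketch of the nearswap merge rule, applied element-wise.
# Where base and secondary weights are nearly identical, the result moves
# strongly toward the secondary model; large differences keep the base.
def nearswap(base, secondary, t):
    merged = []
    for v0, v1 in zip(base, secondary):
        diff = abs(v0 - v1)
        # interpolation weight: 1.0 when the gap is within t, decaying as it grows
        w = min(t / diff, 1.0) if diff > 0 else 1.0
        merged.append(v0 * (1.0 - w) + v1 * w)
    return merged

# Identical weights swap fully; distant weights barely move at t=0.0001.
print(nearswap([0.5, 0.5], [0.5, 1.5], 0.0001))
```

With `t=0.0001` as in this merge, only near-duplicate weights are affected, which is why the result stays close to the Niitama base.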
### Models Merged

The following models were included in the merge:
* Base model: [Sao10K/L3.1-8B-Niitama-v1.1](https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1) + [grimjim/Llama-3-Instruct-abliteration-LoRA-8B](https://huggingface.co/grimjim/Llama-3-Instruct-abliteration-LoRA-8B)
* [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
        layer_range: [0, 32]
      - model: akjindal53244/Llama-3.1-Storm-8B
        layer_range: [0, 32]
merge_method: nearswap
base_model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
parameters:
  t:
    - value: 0.0001
dtype: float16

# Then, DPO finetune on
# jondurbin/gutenberg-dpo-v0.1
```

### DPO Notes

*I used a higher learning rate and the full dataset compared to my "L3.1-Celestial-Stone-2x8B-DPO" run. This resulted in lower loss and better adaptation to the chosen style.*
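For readers unfamiliar with DPO, the per-example objective can be written out directly. This is a toy illustration of the standard DPO loss, not the training code used for this model; `dpo_loss` is a hypothetical helper.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss from summed token log-probabilities.

    Rewards the policy for raising the log-prob of the "chosen" (here:
    human-like Gutenberg) completion relative to the "rejected" one,
    measured against a frozen reference model.
    """
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid)

# A policy that prefers the chosen completion more than the reference does
# gets a loss below log(2), the value at zero margin.
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```

Lower loss during such a finetune means the policy's preference margin over the reference keeps growing, which is what the note above reports.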

-------------------------------------------------------------------------------

# Prompt Template:
```bash
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>

```
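Assembling this template by hand is easy to get wrong (the double newlines after each header matter). A minimal helper for a single-turn exchange, following the template above; `build_prompt` is a hypothetical function, not part of the model repo:

```python
# Builds the Llama-3 style prompt shown above for one system/user turn,
# leaving the assistant header open so the model generates the reply.
def build_prompt(system_prompt: str, user_input: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_prompt("You are a helpful assistant.", "Hello!"))
```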

Credit to Alchemonaut.

Credit to Sao10K.

Credit to Grimjim.

Credit to mlabonne.

Credit to jondurbin.

Credit to woofwolfy.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_v000000__L3.1-Niitorm-8B-DPO-t0.0001).

| Metric             |Value|
|--------------------|----:|
| Avg.               |27.89|
| IFEval (0-Shot)    |76.89|
| BBH (3-Shot)       |30.51|
| MATH Lvl 5 (4-Shot)|14.88|
| GPQA (0-shot)      | 5.93|
| MuSR (0-shot)      | 7.26|
| MMLU-PRO (5-shot)  |31.85|
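The "Avg." row is simply the mean of the six benchmark scores, which is quick to verify:

```python
# Sanity check: the leaderboard average is the mean of the six scores above.
scores = [76.89, 30.51, 14.88, 5.93, 7.26, 31.85]
avg = round(sum(scores) / len(scores), 2)
print(avg)
```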
l3.1-niitorm-8b-dpo-t0.0001.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:577f4a2f81acc40db5df3e511e4f9fae2ea2f1055d77f89d7ae5f7e9b7feceba
+ size 4661217056
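A Git LFS pointer records the blob's SHA-256 (`oid`) and byte size rather than the data itself, so a downloaded copy of the GGUF can be verified locally. A minimal sketch, using a small stand-in file rather than the ~4.7 GB model; `lfs_verify` is a hypothetical helper, not part of git-lfs:

```python
import hashlib
import os
import tempfile

def lfs_verify(path: str, expected_oid: str, expected_size: int) -> bool:
    """Check a local file against an LFS pointer's size and sha256 oid."""
    if os.path.getsize(path) != expected_size:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected_oid

# Demonstration on a 5-byte stand-in file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    tmp = f.name
ok = lfs_verify(tmp, hashlib.sha256(b"hello").hexdigest(), 5)
os.unlink(tmp)
print(ok)
```

For the actual download, the expected values are the `oid` and `size` lines from the pointer above.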