ymcki committed
Commit e0bfa41
1 Parent(s): 74de57e

revert to origin
README.md CHANGED
@@ -3,7 +3,8 @@ base_model: google/gemma-2-2b-jpn-it
 language:
 - multilingual
 datasets:
-- mlabonne/orpo-dpo-mix-40k
+- mlabonne/harmless_alpaca
+- mlabonne/harmful_behaviors
 library_name: transformers
 license: gemma
 license_link: https://ai.google.dev/gemma/terms
@@ -39,21 +40,13 @@ described by mlabonne.
 
 Layer 17 of the original model was chosen for abliteration.
 I also created another layer 18 abliterated model for comparison.
+These two layers were chosen because they both produce uncensored
+responses after the respective layer is abliterated.
 
-ORPO fine tuning was performed for eight epochs.
-
-| Epoch | loss | eval_loss |
-| ----- | ---- | --------- |
-| 1 | 1.20152769684791564 | 1.0501047372817993 |
-| 2 | 1.25755584239959716 | 1.0144596099853516 |
-| 3 | 0.93099724054336543 | 0.9957754611968994 |
-| 4 | 0.88664623498916623 | 0.9857067465782166 |
-| 5 | 0.86961059570312504 | 1.0203918218612670 |
-| 6 | 0.98065975904464630 | 0.9958684444427490 |
-| 7 | 0.38512575328350068 | 0.9686505198478699 |
-| 8 | 1.41178888082504270 | 0.9652527570724487 |
-
-The fine-tuned model is uploaded here to be evaluated by the Open LLM Leaderboard, to see if the slightly brain-damaged non-ORPO model can be healed. The fine-tuning method is again based on one described by [mlabonne](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada), but the model was loaded with [unsloth](https://github.com/unslothai/unsloth) to save enough VRAM to train on the full 40k dataset with a single 3090.
+It is uploaded here to be evaluated by the Open LLM Leaderboard to see
+how brain-damaged it is compared to the original model.
+
+ORPO fine tuning is currently underway to see if it can regain its sanity. You can play with this model first, or wait until I am done with the fine tuning.
 
 ## Benchmark (100.0*raw scores only)
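
For context, the abliteration method referenced in this hunk (per mlabonne's
write-up) estimates a "refusal direction" at a chosen layer, here 17 or 18, by
taking the difference between mean residual-stream activations on harmful and
harmless prompts (the two datasets added above). That direction is then
projected out of weight matrices that write to the residual stream. The
following is a minimal, hypothetical PyTorch sketch of the core math; the
function names and shapes are assumptions, not the author's actual code.

```python
import torch

def refusal_direction(harmful_acts: torch.Tensor,
                      harmless_acts: torch.Tensor) -> torch.Tensor:
    # Inputs: (num_prompts, hidden_size) residual-stream activations captured
    # at the chosen layer (e.g. layer 17) for the last token of each prompt.
    direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return direction / direction.norm()

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    # Remove the refusal component from a matrix whose output lives in the
    # residual stream (rows indexed by hidden_size): W' = W - d (d^T W),
    # so W' @ x has no component along `direction` for any input x.
    d = direction / direction.norm()
    return weight - torch.outer(d, d @ weight)
```

Projecting the direction out of, for example, each attention output projection
and MLP down-projection is what leaves the model uncensored but slightly
"brain damaged", which the benchmarks below quantify.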
 
@@ -62,13 +55,10 @@ Click on the model name to go to the raw score json generated by Open LLM Leaderboard.
 | Model | Average | IFEval | BBH | Math Lv5 | GPQA | MUSR | MMLU-PRO |
 | ----- | ------- | ------ | --- | -------- | ---- | ---- | -------- |
 | [gemma-2-2b-jpn-it](https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/google/gemma-2-2b-jpn-it/results_2024-10-15T15-21-39.173019.json) | 30.82 | 54.11 | 41.43 | 0.0 | 27.52 | 37.17 | 24.67 |
-| [gemma-2-2b-jpn-it-abliterated-17-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-20T02-46-59.069357.json) | 29.99 | 50.94 | 38.59 | 2.87 | 27.43 | 38.23 | 21.86 |
-| gemma-2-2b-jpn-it-abliterated-17-ORPO (8 epochs) | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
-| [gemma-2-2b-jpn-it-abliterated-18-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18-ORPO/results_2024-10-22T04-04-56.385050.json) | 29.94 | 48.97 | 40.18 | 3.02 | 26.17 | 39.42 | 21.85 |
 | [gemma-2-2b-jpn-it-abliterated-17](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17/results_2024-10-18T15-18-46.821674.json) | 30.29 | 52.65 | 40.46 | 0.0 | 27.18 | 36.90 | 24.55 |
 | [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18/results_2024-10-18T15-41-42.399571.json) | 30.61 | 53.02 | 40.96 | 0.0 | 27.35 | 37.30 | 25.05 |
 
-Looks like fine tuning is probably not enough. May need to run more epochs.
+It is only slightly dumber than the original.
 
 ## How to run this model
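
The deleted rows above record ORPO runs on the abliterated models. For readers
who want to attempt something similar, here is a hypothetical sketch using
TRL's ORPOTrainer; the hyperparameters are illustrative, the exact TRL
argument names vary between versions, and the author's actual run used
unsloth to fit the full 40k-sample dataset on a single 3090.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "ymcki/gemma-2-2b-jpn-it-abliterated-17"  # abliterated model to heal
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference triples (prompt, chosen, rejected); mlabonne's article also
# renders the chat columns with the tokenizer's template before training.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

args = ORPOConfig(
    output_dir="gemma-2-2b-jpn-it-abliterated-17-ORPO",
    beta=0.1,                      # weight of the odds-ratio preference term
    num_train_epochs=8,            # the epoch table above logs eight epochs
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=8e-6,
)

trainer = ORPOTrainer(model=model, args=args,
                      train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```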
 
@@ -77,7 +67,7 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 import transformers
 import torch
 
-model_id = "gemma-2-2b-jpn-it-abliterated-17-ORPO"
+model_id = "gemma-2-2b-jpn-it-abliterated-17"
 dtype = torch.bfloat16
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
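
The hunk above shows only the first lines of the README's usage snippet. A
plausible completion with standard transformers generation APIs (the prompt
and generation settings are illustrative, not part of the diff):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gemma-2-2b-jpn-it-abliterated-17"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto",
                                             torch_dtype=dtype)

# Gemma chat format: alternating user/model turns, no system role.
chat = [{"role": "user", "content": "Write a haiku about autumn."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False,
                                       add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=False,
                          return_tensors="pt").to(model.device)
outputs = model.generate(input_ids=inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
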
@@ -103,11 +93,9 @@ pip install -U "huggingface_hub[cli]"
 Then, you can target the specific file you want:
 
 ```
-huggingface-cli download ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO --include "*" --local-dir ./
+huggingface-cli download ymcki/gemma-2-2b-jpn-it-abliterated-17 --include "*" --local-dir ./
 ```
 
 ## Credits
 
-Thank you mlabonne for describing his fine-tuning method.
-
-Thanks FullOf_Bad_Ideas from LocalLlama for the suggestion of using unsloth to save VRAM.
+Thank you mlabonne for describing his abliteration method.
 
 
config.json CHANGED
@@ -1,15 +1,15 @@
 {
-  "_name_or_path": "gemma-2-2b-jpn-it-abliterated-17",
+  "_name_or_path": "google/gemma-2-2b-it",
   "architectures": [
     "Gemma2ForCausalLM"
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
   "attn_logit_softcapping": 50.0,
-  "bos_token_id": 256000,
+  "bos_token_id": 2,
   "cache_implementation": "hybrid",
   "dtype": "bfloat16",
-  "eos_token_id": 256001,
+  "eos_token_id": 1,
   "final_logit_softcapping": 30.0,
   "head_dim": 256,
   "hidden_activation": "gelu_pytorch_tanh",
@@ -21,7 +21,7 @@
   "num_attention_heads": 8,
   "num_hidden_layers": 26,
   "num_key_value_heads": 4,
-  "pad_token_id": 256001,
+  "pad_token_id": 0,
   "query_pre_attn_scalar": 224,
   "rms_norm_eps": 1e-06,
   "rope_theta": 10000.0,
@@ -29,5 +29,5 @@
   "torch_dtype": "bfloat16",
   "transformers_version": "4.45.2",
   "use_cache": true,
-  "vocab_size": 256002
+  "vocab_size": 256000
 }
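
This revert swaps out the ChatML special tokens (IDs 256000 and 256001, which
had grown the vocabulary to 256002) for Gemma's stock IDs: bos 2, eos 1,
pad 0, with the vocabulary back at 256000. A quick consistency check one
might run after such a revert (the repo id is illustrative):

```python
from transformers import AutoConfig, AutoTokenizer

repo = "ymcki/gemma-2-2b-jpn-it-abliterated-17"  # illustrative repo id
config = AutoConfig.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

# Special-token ids must agree with the tokenizer and fit inside the vocab;
# e.g. an eos_token_id of 256001 with a 256000-row embedding would index
# out of range or silently break generation stopping.
assert config.vocab_size == 256000
assert tokenizer.convert_tokens_to_ids("<bos>") == config.bos_token_id == 2
assert tokenizer.convert_tokens_to_ids("<eos>") == config.eos_token_id == 1
assert tokenizer.convert_tokens_to_ids("<pad>") == config.pad_token_id == 0
```
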
generation_config.json CHANGED
@@ -1,8 +1,8 @@
 {
   "_from_model_config": true,
-  "bos_token_id": 256000,
+  "bos_token_id": 2,
   "cache_implementation": "hybrid",
-  "eos_token_id": 256001,
-  "pad_token_id": 256001,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
   "transformers_version": "4.45.2"
 }
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8890898ad4754962d00ab14a138953e42e22e8fec43330291dad46ec7c7b44c7
-size 4988034976
+oid sha256:31d0fe28f07c45851e212d86ef302d16b628e438ba664a2e9ce3d880b6c3636e
+size 4988025760
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d886f8d30ac1e1cbe189211fd61c2ed1e92280169f1df9b8e80dbd057cdd123d
+oid sha256:e2014a813a43e0f5e16de1a357163996b5e7ce473e0b9f54c22665aa6542f008
 size 240691728
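
The .safetensors entries above are Git LFS pointers: oid is the sha256 of the
real shard and size is its byte count. If a downloaded shard looks corrupt,
the hash can be verified locally; a small sketch (the filename matches the
pointer above):

```python
import hashlib

def sha256sum(path: str, chunk: int = 1 << 20) -> str:
    # Stream in 1 MiB chunks so multi-GB shards never sit fully in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Expected oid from the LFS pointer after this commit.
expected = "e2014a813a43e0f5e16de1a357163996b5e7ce473e0b9f54c22665aa6542f008"
assert sha256sum("model-00002-of-00002.safetensors") == expected
```
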
model.safetensors.index.json CHANGED
@@ -1,6 +1,6 @@
 {
   "metadata": {
-    "total_size": 5228692992
+    "total_size": 5228683776
   },
   "weight_map": {
     "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
special_tokens_map.json CHANGED
@@ -1,23 +1,19 @@
 {
-  "additional_special_tokens": [
-    {
-      "content": "<|im_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false
-    },
-    {
-      "content": "<|im_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false
-    }
-  ],
-  "bos_token": "<|im_start|>",
-  "eos_token": "<|im_end|>",
-  "pad_token": "<|im_end|>",
+  "bos_token": {
+    "content": "<bos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<eos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "<eos>",
   "unk_token": {
     "content": "<unk>",
     "lstrip": false,
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:798ce05be77ad0639365c369e448442dd9a31f614551b88413d483c4d45c1839
-size 34363415
+oid sha256:a5d1608a9eb188afeeb8a532bace0a4bcdc70afd57041b8441ee1940e3b19f72
+size 34363039
tokenizer_config.json CHANGED
@@ -1993,38 +1993,14 @@
       "rstrip": false,
       "single_word": false,
       "special": false
-    },
-    "256000": {
-      "content": "<|im_start|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "256001": {
-      "content": "<|im_end|>",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
     }
   },
-  "additional_special_tokens": [
-    "<|im_start|>",
-    "<|im_end|>"
-  ],
-  "bos_token": "<|im_start|>",
-  "chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
+  "bos_token": "<bos>",
+  "chat_template": "{{ bos_token }}{% if messages[0]['role'] == 'system' %}{{ raise_exception('System role not supported') }}{% endif %}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if (message['role'] == 'assistant') %}{% set role = 'model' %}{% else %}{% set role = message['role'] %}{% endif %}{{ '<start_of_turn>' + role + '\n' + message['content'] | trim + '<end_of_turn>\n' }}{% endfor %}{% if add_generation_prompt %}{{'<start_of_turn>model\n'}}{% endif %}",
   "clean_up_tokenization_spaces": false,
-  "eos_token": "<|im_end|>",
-  "max_length": null,
+  "eos_token": "<eos>",
   "model_max_length": 1000000000000000019884624838656,
-  "pad_to_multiple_of": null,
-  "pad_token": "<|im_end|>",
-  "pad_token_type_id": 0,
-  "padding_side": "left",
+  "pad_token": "<eos>",
   "sp_model_kwargs": {},
   "spaces_between_special_tokens": false,
   "tokenizer_class": "GemmaTokenizer",
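
The restored chat_template in the hunk above is the stock Gemma format: a bos
token, user/model turns wrapped in <start_of_turn> and <end_of_turn>, the
assistant role renamed to model, and no system role. A short illustration of
how transformers renders it; the expected output in the comment assumes that
template is active:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ymcki/gemma-2-2b-jpn-it-abliterated-17")

chat = [{"role": "user", "content": "Hello"}]
text = tokenizer.apply_chat_template(chat, tokenize=False,
                                     add_generation_prompt=True)
print(text)
# <bos><start_of_turn>user
# Hello<end_of_turn>
# <start_of_turn>model
```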