Transformers
GGUF
Inference Endpoints
conversational
nbeerbower committed on
Commit
14f42e9
1 Parent(s): 0cafbc3

add model files

README.md CHANGED
@@ -1,3 +1,102 @@
- ---
- license: llama3
- ---
+ ---
+ library_name: transformers
+ tags: []
+ base_model:
+ - flammenai/Mahou-1.1-llama3-8B
+ datasets:
+ - flammenai/Grill-preprod-v1_chatML
+ license: llama3
+ ---
+
+ ![image/png](https://huggingface.co/flammenai/Mahou-1.0-mistral-7B/resolve/main/mahou1.png)
+
+ # Mahou-1.1-llama3-8B
+
+ Mahou is our attempt to build a production-ready conversational/roleplay LLM.
+
+ Future versions will be released iteratively and finetuned from flammen.ai conversational data.
+
+ ### Chat Format
+
+ This model has been trained to use the ChatML format.
+
+ ```
+ <|im_start|>system
+ {{system}}<|im_end|>
+ <|im_start|>{{char}}
+ {{message}}<|im_end|>
+ <|im_start|>{{user}}
+ {{message}}<|im_end|>
+ ```
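+
+ For reference, a ChatML prompt can be rendered with the `transformers` chat-template API. This is a minimal sketch using standard roles, assuming the checkpoint ships a ChatML chat template (the format above substitutes character names for the role tags, and the message contents here are illustrative):
+
+ ```python
+ from transformers import AutoTokenizer
+
+ # Base model repo id from the front matter; point this at the
+ # checkpoint you are actually loading.
+ tokenizer = AutoTokenizer.from_pretrained("flammenai/Mahou-1.1-llama3-8B")
+
+ messages = [
+     {"role": "system", "content": "You are Mahou, a conversational roleplay model."},
+     {"role": "user", "content": "Hello!"},
+ ]
+
+ # Renders the messages through the tokenizer's chat template and appends
+ # the opening tokens for the assistant's reply.
+ prompt = tokenizer.apply_chat_template(
+     messages, tokenize=False, add_generation_prompt=True
+ )
+ print(prompt)
+ ```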
+
+ ### ST Settings
+
+ 1. Use ChatML for the Context Template.
+ 2. Turn on Instruct Mode for ChatML.
+ 3. Use the following stopping strings: `["<", "|", "<|", "\n"]`
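+
+ Outside of SillyTavern, the same stopping strings can be passed straight to an inference backend. A minimal sketch with `llama-cpp-python` and the Q4_K_M quant added in this commit, assuming the file has been downloaded locally:
+
+ ```python
+ from llama_cpp import Llama
+
+ # Local path assumed; see the GGUF files added further down in this commit.
+ llm = Llama(model_path="ggml-model-Q4_K_M.gguf", n_ctx=8192)
+
+ output = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "You are Mahou."},
+         {"role": "user", "content": "Hello!"},
+     ],
+     stop=["<", "|", "<|", "\n"],  # the same stopping strings as above
+     max_tokens=256,
+ )
+ print(output["choices"][0]["message"]["content"])
+ ```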
+
+ ### License
+
+ This model is based on Meta Llama-3-8B and is governed by the [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](LICENSE).
+
+ ### Method
+
+ Finetuned using an A100 on Google Colab.
+
+ [Fine-tune a Mistral-7b model with Direct Preference Optimization](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac) - [Maxime Labonne](https://huggingface.co/mlabonne)
+
+ ### Configuration
+
+ LoRA, model, and training settings:
+
+ ```python
+ import torch
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, TrainingArguments
+ from trl import DPOTrainer
+
+ # Assumes model_name, new_model, dataset, and tokenizer are defined
+ # earlier in the notebook.
+
+ # LoRA configuration
+ peft_config = LoraConfig(
+     r=16,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     bias="none",
+     task_type="CAUSAL_LM",
+     target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
+ )
+
+ # Model to fine-tune (4-bit quantized via bitsandbytes)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     load_in_4bit=True
+ )
+ model.config.use_cache = False
+
+ # Frozen reference model for the DPO loss
+ ref_model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     load_in_4bit=True
+ )
+
+ # Training arguments
+ training_args = TrainingArguments(
+     per_device_train_batch_size=2,
+     gradient_accumulation_steps=2,
+     gradient_checkpointing=True,
+     learning_rate=3e-5,
+     lr_scheduler_type="cosine",
+     max_steps=420,
+     save_strategy="no",
+     logging_steps=1,
+     output_dir=new_model,
+     optim="paged_adamw_32bit",
+     warmup_steps=100,
+     bf16=True,
+     report_to="wandb",
+ )
+
+ # Create DPO trainer
+ dpo_trainer = DPOTrainer(
+     model,
+     ref_model,
+     args=training_args,
+     train_dataset=dataset,
+     tokenizer=tokenizer,
+     peft_config=peft_config,
+     beta=0.1,
+     force_use_ref_model=True
+ )
+ ```
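+
+ With the trainer configured as above, training and saving reduce to the usual `transformers.Trainer` calls, which `DPOTrainer` inherits; a minimal sketch under the same assumptions:
+
+ ```python
+ # Run DPO training, then persist the fine-tuned model under new_model.
+ dpo_trainer.train()
+ dpo_trainer.save_model(new_model)
+ ```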
ggml-model-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:678a1680ed8048bda38a24259b230a837c669a81a56861ef5ccb96472b678773
+ size 4018917600
ggml-model-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9506414a6d0cf1f1c0e8a65742510bb2eedd090c4c3904731b0e309ab35e0cdc
+ size 4920733920
ggml-model-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6affd7b9a43295ca2122c6853bfe5be127ab87b53db37ec390dfec70a5d73a9e
+ size 5732987104
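
The three entries above are Git LFS pointer files; the quantized weights themselves resolve through LFS. A minimal sketch for fetching one quant with `huggingface_hub` (the repo id below is illustrative, since the commit page does not name the repository):

```python
from huggingface_hub import hf_hub_download

# Illustrative repo id; substitute the repository this commit belongs to.
path = hf_hub_download(
    repo_id="nbeerbower/Mahou-1.1-llama3-8B-GGUF",
    filename="ggml-model-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded quant
```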