PEFT
K00B404 committed
Commit df08d2b
Parent: a0db1c5

Update README.md


Tuned the training hyperparameters (save_steps, logging_steps, max_steps) and cleaned up the fine-tuning notebook text embedded in the README.

Files changed (1)
  1. README.md +5 -12
README.md CHANGED
@@ -20,13 +20,9 @@ The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_compute_dtype: float16
 ### Framework versions
 
-
 - PEFT 0.6.0.dev0
 
-# -*- coding: utf-8 -*-
-"""bf16_sharded_Fine_Tuning_using_QLora(1).ipynb
-
-Automatically generated by Colaboratory.
+"""
 
 Original file is located at
 https://colab.research.google.com/drive/1yH0ov1ZDpun6yGi19zE07jkF_EUMI1Bf
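For reference, a 4-bit `bitsandbytes` setup with `bnb_4bit_compute_dtype: float16` (the value listed in the card above) is normally expressed through `transformers.BitsAndBytesConfig`. The sketch below is an assumption about how the notebook loads the base model, not a cell taken from this commit; the model id and quantization type are placeholders.

```python
# Assumed sketch, not a cell from this commit: a 4-bit quantization config whose
# compute dtype matches the `bnb_4bit_compute_dtype: float16` entry in the card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize the base weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",             # assumed; the card only records the compute dtype
    bnb_4bit_compute_dtype=torch.float16,  # value listed in the model card
)

model = AutoModelForCausalLM.from_pretrained(
    "<base_model_id>",                     # placeholder; the base model is defined in the notebook
    quantization_config=bnb_config,
    device_map="auto",
)
```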
@@ -63,6 +59,7 @@ wandb_key=["<API_KEY>"]
 wandb.init(project="<project_name>",
            name="<name>"
            )
+
 # login with API
 from huggingface_hub import login
 login()
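The hunk above touches the experiment-tracking and Hub-authentication cells. A minimal, self-contained version of that setup, assuming `wandb_key` holds the key shown earlier in the notebook, would look like this:

```python
# Self-contained version of the tracking/auth cells shown in the hunk above.
import wandb
from huggingface_hub import login

wandb_key = ["<API_KEY>"]            # placeholder, exactly as in the notebook
wandb.login(key=wandb_key[0])        # assumed step; the hunk itself only shows wandb.init
wandb.init(project="<project_name>",
           name="<name>")

# login with API
login()                              # prompts for a Hugging Face access token
```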
@@ -127,11 +124,11 @@ output_dir = "./results"
 per_device_train_batch_size = 4
 gradient_accumulation_steps = 4
 optim = "paged_adamw_32bit"
-save_steps = 10
-logging_steps = 11
+save_steps = 100
+logging_steps = 10
 learning_rate = 2e-4
 max_grad_norm = 0.3
-max_steps = 10
+max_steps = 100
 warmup_ratio = 0.03
 lr_scheduler_type = "constant"
 
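This hunk raises `save_steps` and `max_steps` from 10 to 100 and changes `logging_steps` from 11 to 10. In a typical QLoRA notebook these variables are fed into `transformers.TrainingArguments`; the wiring below is an assumed sketch using the post-commit values, not the exact cell from this repository.

```python
# Assumed wiring: the hyperparameters above passed to transformers.TrainingArguments,
# using the values introduced by this commit.
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    save_steps=100,            # was 10 before this commit
    logging_steps=10,          # was 11 before this commit
    learning_rate=2e-4,
    max_grad_norm=0.3,
    max_steps=100,             # was 10 before this commit
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
)
```

With `max_steps = 100` and `save_steps = 100`, a single checkpoint is written at the final step, and `logging_steps = 10` yields ten loss readings per run.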
 
@@ -175,16 +172,12 @@ for name, module in trainer.model.named_modules():
 
 """## Train the model
 You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
-
 Now let's train the model! Simply call `trainer.train()`
 """
 
 trainer.train()
 
 """During training, the model should converge nicely as follows:
-
-![image](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/loss-falcon-7b.png)
-
 The `SFTTrainer` also takes care of properly saving only the adapters during training instead of saving the entire model.
 """
 
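Because the `SFTTrainer` saves only the adapter weights, a checkpoint written under `output_dir` can later be reattached to the base model with `peft.PeftModel`. The paths and model id below are assumptions (with `save_steps = 100` the last checkpoint would land at `./results/checkpoint-100`), not values taken from this commit:

```python
# Assumed paths and model id: reattaching an adapter checkpoint saved under
# output_dir to its base model for inference.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "<base_model_id>",                  # placeholder; the base model is set earlier in the notebook
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "./results/checkpoint-100")  # assumed checkpoint path
model.eval()
```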
 
 