PEFT
K00B404 committed
Commit df08d2b
Parent: a0db1c5

Update README.md


Tuned the training hyperparameters (save_steps, logging_steps, max_steps) and cleaned up the fine-tuning notebook text embedded in the README.

Files changed (1)
  1. README.md +5 -12
README.md CHANGED
@@ -20,13 +20,9 @@ The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_compute_dtype: float16
 ### Framework versions
 
-
 - PEFT 0.6.0.dev0
 
-# -*- coding: utf-8 -*-
-"""bf16_sharded_Fine_Tuning_using_QLora(1).ipynb
-
-Automatically generated by Colaboratory.
+"""
 
 Original file is located at
 https://colab.research.google.com/drive/1yH0ov1ZDpun6yGi19zE07jkF_EUMI1Bf
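For reference, a 4-bit `bitsandbytes` setup with `bnb_4bit_compute_dtype: float16` (the value listed in the card above) is normally expressed through `transformers.BitsAndBytesConfig`. The sketch below is an assumption about how the notebook loads the base model, not a cell taken from this commit; the model id and quantization type are placeholders.

```python
# Assumed sketch, not a cell from this commit: a 4-bit quantization config whose
# compute dtype matches the `bnb_4bit_compute_dtype: float16` entry in the card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize the base weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",             # assumed; the card only records the compute dtype
    bnb_4bit_compute_dtype=torch.float16,  # value listed in the model card
)

model = AutoModelForCausalLM.from_pretrained(
    "<base_model_id>",                     # placeholder; the base model is defined in the notebook
    quantization_config=bnb_config,
    device_map="auto",
)
```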
@@ -63,6 +59,7 @@ wandb_key=["<API_KEY>"]
 wandb.init(project="<project_name>",
            name="<name>"
            )
+
 # login with API
 from huggingface_hub import login
 login()
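The hunk above touches the experiment-tracking and Hub-authentication cells. A minimal, self-contained version of that setup, assuming `wandb_key` holds the key shown earlier in the notebook, would look like this:

```python
# Self-contained version of the tracking/auth cells shown in the hunk above.
import wandb
from huggingface_hub import login

wandb_key = ["<API_KEY>"]            # placeholder, exactly as in the notebook
wandb.login(key=wandb_key[0])        # assumed step; the hunk itself only shows wandb.init
wandb.init(project="<project_name>",
           name="<name>")

# login with API
login()                              # prompts for a Hugging Face access token
```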
@@ -127,11 +124,11 @@ output_dir = "./results"
 per_device_train_batch_size = 4
 gradient_accumulation_steps = 4
 optim = "paged_adamw_32bit"
-save_steps = 10
-logging_steps = 11
+save_steps = 100
+logging_steps = 10
 learning_rate = 2e-4
 max_grad_norm = 0.3
-max_steps = 10
+max_steps = 100
 warmup_ratio = 0.03
 lr_scheduler_type = "constant"
 
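This hunk raises `save_steps` and `max_steps` from 10 to 100 and changes `logging_steps` from 11 to 10. In a typical QLoRA notebook these variables are fed into `transformers.TrainingArguments`; the wiring below is an assumed sketch using the post-commit values, not the exact cell from this repository.

```python
# Assumed wiring: the hyperparameters above passed to transformers.TrainingArguments,
# using the values introduced by this commit.
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    save_steps=100,            # was 10 before this commit
    logging_steps=10,          # was 11 before this commit
    learning_rate=2e-4,
    max_grad_norm=0.3,
    max_steps=100,             # was 10 before this commit
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
)
```

With `max_steps = 100` and `save_steps = 100`, a single checkpoint is written at the final step, and `logging_steps = 10` yields ten loss readings per run.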
 
@@ -175,16 +172,12 @@ for name, module in trainer.model.named_modules():
 
 """## Train the model
 You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
-
 Now let's train the model! Simply call `trainer.train()`
 """
 
 trainer.train()
 
 """During training, the model should converge nicely as follows:
-
-![image](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/loss-falcon-7b.png)
-
 The `SFTTrainer` also takes care of properly saving only the adapters during training instead of saving the entire model.
 """
 
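Because the `SFTTrainer` saves only the adapter weights, a checkpoint written under `output_dir` can later be reattached to the base model with `peft.PeftModel`. The paths and model id below are assumptions (with `save_steps = 100` the last checkpoint would land at `./results/checkpoint-100`), not values taken from this commit:

```python
# Assumed paths and model id: reattaching an adapter checkpoint saved under
# output_dir to its base model for inference.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "<base_model_id>",                  # placeholder; the base model is set earlier in the notebook
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "./results/checkpoint-100")  # assumed checkpoint path
model.eval()
```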
 
 