Commit 69cf9b1 by skkjodhpur, 1 parent: e390b36

Update README.md

  1. README.md +61 -57
README.md CHANGED
@@ -20,63 +20,67 @@ tags:
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
 
- # How to Get Started with the Model
- Use the code below to get started with the model:
-
- Python Code:
-
- %%capture
- # Installs Unsloth, Xformers (Flash Attention) and all other packages!
- !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
- !pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
-
- from unsloth import FastLanguageModel
- import torch
-
- # Define the dtype you want to use
- dtype = torch.float16 # Example: using float16 for lower memory usage
-
- # Set load_in_4bit to True or False depending on your requirements
- load_in_4bit = True # Or False if you don't want to load in 4-bit
-
- # Verify the model name is correct and exists on Hugging Face Model Hub
- model_name = "skkjodhpur/Meta-Llama-3.1-8B-Unsloth-2x-faster-finetuning-GGUF-by-skk"
- # Check if the model exists, if not, you may need to adjust the model name
- !curl -s https://huggingface.co/{model_name}/resolve/main/config.json | jq .
-
- model, tokenizer = FastLanguageModel.from_pretrained(
- model_name = model_name,
- max_seq_length = 2048,
- dtype = dtype,
- load_in_4bit = load_in_4bit,
- )
- FastLanguageModel.for_inference(model) # Enable native 2x faster inference
-
- # prompt = You MUST copy from above!
-
- prompt = """Below is an tools that describes a task, paired with an query that provides further context. Write a answers that appropriately completes the request.
-
- ### tools:
- {}
-
- ### query:
- {}
-
- ### answers:
- {}"""
-
- inputs = tokenizer(
- [
- prompt.format(
- '[{"name": "live_giveaways_by_type", "description": "Retrieve live giveaways from the GamerPower API based on the specified type.", "parameters": {"type": {"description": "The type of giveaways to retrieve (e.g., game, loot, beta).", "type": "str", "default": "game"}}}]', # instruction
- "Where can I find live giveaways for beta access and games?", # input
- "", # output - leave this blank for generation!
- )
- ], return_tensors = "pt").to("cuda")
-
- from transformers import TextStreamer
- text_streamer = TextStreamer(tokenizer)
- _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
+ # Meta-Llama-3.1-8B-Unsloth-2x-faster-finetuning-GGUF-by-skk
+
+ This repository contains the Meta-Llama-3.1-8B-Unsloth-2x-faster-finetuning-GGUF model, optimized for faster inference.
+
+ ## Getting Started
+
+ Use the following Python code to get started with the model:
+
+ ```python
+ %%capture
+ # Installs Unsloth, Xformers (Flash Attention) and all other packages!
+ !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+ !pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
+
+ from unsloth import FastLanguageModel
+ import torch
+
+ # Define the dtype you want to use
+ dtype = torch.float16 # Example: using float16 for lower memory usage
+
+ # Set load_in_4bit to True or False depending on your requirements
+ load_in_4bit = True # Or False if you don't want to load in 4-bit
+
+ # Verify the model name is correct and exists on Hugging Face Model Hub
+ model_name = "skkjodhpur/Meta-Llama-3.1-8B-Unsloth-2x-faster-finetuning-GGUF-by-skk"
+ # Check if the model exists, if not, you may need to adjust the model name
+ !curl -s https://huggingface.co/{model_name}/resolve/main/config.json | jq .
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+ model_name = model_name,
+ max_seq_length = 2048,
+ dtype = dtype,
+ load_in_4bit = load_in_4bit,
+ )
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
+
+ # prompt = You MUST copy from above!
+
+ prompt = """Below is an tools that describes a task, paired with an query that provides further context. Write a answers that appropriately completes the request.
+
+ ### tools:
+ {}
+
+ ### query:
+ {}
+
+ ### answers:
+ {}"""
+
+ inputs = tokenizer(
+ [
+ prompt.format(
+ '[{"name": "live_giveaways_by_type", "description": "Retrieve live giveaways from the GamerPower API based on the specified type.", "parameters": {"type": {"description": "The type of giveaways to retrieve (e.g., game, loot, beta).", "type": "str", "default": "game"}}}]', # instruction
+ "Where can I find live giveaways for beta access and games?", # input
+ "", # output - leave this blank for generation!
+ )
+ ], return_tensors = "pt").to("cuda")
+
+ from transformers import TextStreamer
+ text_streamer = TextStreamer(tokenizer)
+ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
 
 
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
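
The prompt template in this README takes three positional fields, in the order tools, query, answers. As a minimal sketch (not part of the commit), the template can be filled without loading the model, which makes it easy to inspect the exact text that will be tokenized. The helper name `build_prompt` is hypothetical, and serializing the tool list with `json.dumps` is an assumption; the README itself passes the tools as a pre-serialized JSON string.

```python
import json

# Prompt template copied verbatim from the README in this commit.
PROMPT = """Below is an tools that describes a task, paired with an query that provides further context. Write a answers that appropriately completes the request.

### tools:
{}

### query:
{}

### answers:
{}"""


def build_prompt(tools, query, answers=""):
    """Fill the template (hypothetical helper, not from the README).

    `tools` is a list of dicts; serializing it with json.dumps is an
    assumption about how the tool descriptions should be rendered.
    Leave `answers` empty so the model generates the completion.
    """
    return PROMPT.format(json.dumps(tools), query, answers)


# Same tool description and query as the README example.
tools = [{
    "name": "live_giveaways_by_type",
    "description": "Retrieve live giveaways from the GamerPower API based on the specified type.",
    "parameters": {
        "type": {
            "description": "The type of giveaways to retrieve (e.g., game, loot, beta).",
            "type": "str",
            "default": "game",
        }
    },
}]

text = build_prompt(tools, "Where can I find live giveaways for beta access and games?")
print(text)
```

The resulting string can be passed to `tokenizer([text], return_tensors = "pt")` in place of the inline `prompt.format(...)` call. The template wording is kept exactly as written, since the README's comment indicates the prompt should be copied unchanged.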