Goekdeniz-Guelmez committed on
Commit
8e2b878
1 Parent(s): 9b36d85

Upload folder using huggingface_hub (#1)


- 18f4ca84a129dbd897465c0ce564dc91e98882f75d41d9d2be6223fa74ccd77b (be8de14767ac0b965d47fad54fbf437ed318936f)
- 6e11252276cc7b46fa2ed78b49a03d478c69276ee32460ed1718e0c8643554aa (4dbae4371df7e92347aa5710ed93ca1cd0ca3edf)

Files changed (5)
  1. README.md +43 -0
  2. config.json +23 -0
  3. model.pth +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer.model +3 -0
README.md ADDED
@@ -0,0 +1,43 @@
+ ---
+ base_model: Goekdeniz-Guelmez/j.o.s.i.e.v4o-7b-orpo-stage1-v1
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - qwen2
+ - trl
+ - orpo
+ - KANama
+ ---
+
+ # Goekdeniz-Guelmez/KANama-fineweb-v2-test1
+
+ The model [Goekdeniz-Guelmez/KANama-fineweb-v2-test1](https://huggingface.co/Goekdeniz-Guelmez/KANama-fineweb-v2-test1) was created using KANama.
+
+ ## Use with KANama
+
+ ```bash
+ pip install KANama transformers
+ ```
+
+ ```python
+ from model.handler import from_pretrained, quick_inference
+ from transformers import AutoTokenizer
+ import torch
+
+ # Run on GPU when available, otherwise fall back to CPU.
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ tokenizer = AutoTokenizer.from_pretrained("Doctor-Shotgun/TinyLlama-1.1B-32k")
+ model = from_pretrained("path/to/model/folder")
+
+ prompt = "hello"
+
+ input_tokens = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
+
+ generated_tokens, generated_text = quick_inference(model, input_tokens, max_new_tokens=50, tokenizer=tokenizer)
+ print(generated_text)
+ ```
config.json ADDED
@@ -0,0 +1,23 @@
+ {
+ "vocab_size": 152064,
+ "pad_id": 151645,
+ "eos_id": -1,
+ "dim": 256,
+ "n_layers": 18,
+ "n_heads": 12,
+ "n_kv_heads": 6,
+ "use_kan": true,
+ "train_softmax_temp": true,
+ "use_softmax_temp_proj": true,
+ "softmax_bias": false,
+ "multiple_of": 256,
+ "ffn_dim_multiplier": null,
+ "rms_norm_eps": 1e-05,
+ "rope_theta": 500000,
+ "use_scaled_rope": false,
+ "max_batch_size": 100,
+ "max_seq_len": 128,
+ "num_experts": 14,
+ "num_experts_per_tok": 4,
+ "model_type": "KANaMoEv2"
+ }
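
The config above follows a Llama-style argument layout, with MoE fields (`num_experts`, `num_experts_per_tok`) indicating 14 experts with 4 active per token. Below is a minimal sketch, assuming only the Python standard library, of loading the file into a typed container for inspection; the `KANamaConfig` dataclass is an illustration, not part of the KANama API:

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class KANamaConfig:
    # Field names mirror the keys in config.json exactly,
    # so the file can be splatted straight into the constructor.
    vocab_size: int
    pad_id: int
    eos_id: int
    dim: int
    n_layers: int
    n_heads: int
    n_kv_heads: int
    use_kan: bool
    train_softmax_temp: bool
    use_softmax_temp_proj: bool
    softmax_bias: bool
    multiple_of: int
    ffn_dim_multiplier: Optional[float]
    rms_norm_eps: float
    rope_theta: float
    use_scaled_rope: bool
    max_batch_size: int
    max_seq_len: int
    num_experts: int
    num_experts_per_tok: int
    model_type: str

with open("path/to/model/folder/config.json") as f:  # hypothetical path
    cfg = KANamaConfig(**json.load(f))

print(cfg.model_type, cfg.num_experts, cfg.num_experts_per_tok)
```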
model.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:913390de7a2d013f6afb37d49f0e3b2e84707b2369a6c65f78d4d38ad8a6f2e9
+ size 8272589395
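
Both `model.pth` and `tokenizer.model` (below) are stored with Git LFS, so the commit contains pointer files recording each blob's SHA-256 and byte size rather than the weights themselves. A minimal sketch of verifying a downloaded blob against its pointer; the local paths are hypothetical:

```python
import hashlib

def parse_pointer(pointer_path):
    # An LFS pointer has lines like "oid sha256:<hex>" and "size <bytes>".
    fields = dict(line.strip().split(" ", 1)
                  for line in open(pointer_path) if line.strip())
    return fields["oid"].split(":", 1)[1], int(fields["size"])

def verify(blob_path, pointer_path):
    expected_oid, expected_size = parse_pointer(pointer_path)
    h, size = hashlib.sha256(), 0
    with open(blob_path, "rb") as f:
        # Hash in 1 MiB chunks to avoid loading a multi-GB file at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == expected_oid and size == expected_size

print(verify("model.pth", "model.pth.pointer"))  # hypothetical paths
```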
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a8506e7111b80c6d8635951a02eab0f4e1a8e4e5772da83846579e97b16f61bf
+ size 7031673