retarfi committed
Commit dd20d2e
1 Parent(s): e988e5b

Add model weight

Files changed (3)
  1. README.md +32 -5
  2. adapter_config.json +17 -0
  3. adapter_model.bin +3 -0
README.md CHANGED
@@ -1,10 +1,37 @@
 ---
 license: mit
 datasets:
 - izumi-lab/llm-japanese-dataset
 language:
 - ja
 tags:
 - llama
 - causal-lm
 ---
+
+This repo contains a low-rank adapter for LLaMA-13b,
+fit on the [llm-japanese-dataset](https://github.com/masanorihirano/llm-japanese-dataset) dataset.
+
+This version of the weights was trained with the following hyperparameters:
+
+- Epochs: 1
+- Batch size: 130
+- Cutoff length: 256
+- Learning rate: 3e-4
+- LoRA _r_: 4
+- LoRA target modules: q_proj, v_proj
+
+```python
+import torch
+from transformers import LlamaForCausalLM, LlamaTokenizer
+from peft import PeftModel
+
+base_model = "decapoda-research/llama-13b-hf"
+model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
+tokenizer = LlamaTokenizer.from_pretrained(base_model)
+model = PeftModel.from_pretrained(
+    model,
+    "izumi-lab/llama-13b-japanese-lora-v0",
+    torch_dtype=torch.float16,
+)
+```
adapter_config.json ADDED
@@ -0,0 +1,17 @@
+{
+  "base_model_name_or_path": "decapoda-research/llama-13b-hf",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "lora_alpha": 16,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 4,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}
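The config above pins down how small this adapter is. As a back-of-the-envelope check, assuming standard LLaMA-13b dimensions (hidden size 5120, 40 decoder layers, square q_proj/v_proj projections), the LoRA matrices implied by `r = 4` and two target modules account for roughly the size of adapter_model.bin:

```python
# Sanity-check the parameter count implied by adapter_config.json.
# ASSUMPTIONS: LLaMA-13b dimensions (hidden_size=5120, 40 decoder layers)
# and square 5120x5120 q_proj/v_proj projections.
HIDDEN = 5120
LAYERS = 40
R = 4        # "r" from adapter_config.json
TARGETS = 2  # q_proj and v_proj

# Each LoRA'd module adds A (r x in_features) and B (out_features x r).
params_per_module = R * HIDDEN + HIDDEN * R   # A + B
total_params = params_per_module * TARGETS * LAYERS
print(total_params)      # 3276800

# At 4 bytes/param (float32) that is ~13.1 MB, close to the
# 13,164,557-byte adapter_model.bin; the gap is serialization overhead.
print(total_params * 4)  # 13107200
```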
adapter_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84ebf795b240561cea674046dd1f0861072755b8fdecd9a7896706ac0986cbbc
+size 13164557
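What is checked into git here is not the weights themselves but a Git LFS pointer file: three `key value` lines (version, oid, size) per the git-lfs v1 spec, which LFS-aware clients resolve to the real binary. A minimal sketch of reading such a pointer (a toy parser, not git-lfs itself):

```python
# Minimal parser for a Git LFS pointer file (the three key/value lines
# above), following the format at https://git-lfs.github.com/spec/v1.
# A sketch for illustration, not a replacement for the git-lfs client.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # split on the first space
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:84ebf795b240561cea674046dd1f0861072755b8fdecd9a7896706ac0986cbbc
size 13164557"""

info = parse_lfs_pointer(pointer)
print(info["size"])               # 13164557
print(info["oid"].partition(":")[0])  # sha256
```

The `size` and the sha256 `oid` can be compared against a downloaded adapter_model.bin to verify the file arrived intact.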