torotoki committed
Commit 393117d
Parent: 2c5c1ab

Update README.md

Files changed (1)
  1. README.md +17 -34
README.md CHANGED
@@ -10,14 +10,12 @@ pipeline_tag: text-generation
  # PLaMo-13B

  ## Model Description
- PLaMo-13B is a LLaMA-based 13B model pre-trained on English and Japanese open datasets, developed by Preferred Networks, Inc.
- PLaMo-13B is released under Apache v2.0 license.
+ PLaMo-13B-Instruct is an instruction-tuned model based on the 8192-token-context version of the [PLaMo-13B](https://huggingface.co/pfnet/plamo-13b) text-generation model, fine-tuned on several publicly available datasets.
+ This model is released under Apache v2.0 license.

  [PLaMo-13B Release blog (Japanese)](https://tech.preferred.jp/ja/blog/llm-plamo/)

- ## Usage
-
- ### Requirements
+ ## Requirements

  - numpy
  - safetensors
@@ -25,14 +23,7 @@ PLaMo-13B is released under Apache v2.0 license.
  - torch
  - transformers

- ### Use a pipeline as a high-level helper
- ```python
- import transformers
- pipeline = transformers.pipeline("text-generation", model="pfnet/plamo-13b", trust_remote_code=True)
- print(pipeline("The future of artificial intelligence technology is ", max_new_tokens=32))
- ```
-
- ### Load model directly
+ ## Usage
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
  tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-13b", trust_remote_code=True)
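The hunk above truncates the usage example right after the tokenizer is loaded, and the later hunk headers reference `print(generated_text)` from the omitted part of the example. As a reading aid only, here is a minimal sketch of how such an example typically continues with the standard `transformers` generation API; the dtype, prompt, and sampling settings below are illustrative assumptions rather than the exact code in the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; trust_remote_code=True is needed because the
# PLaMo checkpoint ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-13b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "pfnet/plamo-13b",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # illustrative; pick a dtype your hardware supports
)

# Encode a prompt and sample a short continuation (settings are illustrative).
text = "The future of artificial intelligence technology is "
input_ids = tokenizer(text, return_tensors="pt").input_ids
with torch.no_grad():
    generated_tokens = model.generate(
        inputs=input_ids,
        max_new_tokens=32,
        do_sample=True,
        top_p=0.95,
        temperature=1.0,
    )[0]

generated_text = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(generated_text)
```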
@@ -55,7 +46,8 @@ print(generated_text)

  - Model size: 13B
  - Trained tokens: 1.5T tokens (English: 1.32T tokens, Japanese: 0.18T tokens)
- - Context length: 4096
+ - Tokenizer: sentencepiece tokenizer trained on a subset of the pre-training datasets
+ - Context length: 8192
  - Developed by: Preferred Networks, Inc.
  - Model type: Causal decoder-only
  - Language(s): English, Japanese
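The tokenizer bullet added above (and the Tokenizer section removed in the next hunk) say only that a sentencepiece tokenizer trained on a subset of the pre-training data is used. Purely as an illustration, and not text from the model card itself, a minimal round-trip with the tokenizer from the usage example could look like the sketch below; the sample sentence is an arbitrary assumption.

```python
from transformers import AutoTokenizer

# PLaMo's sentencepiece-based tokenizer ships with custom code in the repo,
# hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-13b", trust_remote_code=True)

sample = "Preferred Networks released PLaMo-13B."       # arbitrary example text
token_ids = tokenizer(sample).input_ids                  # encode to integer ids
pieces = tokenizer.convert_ids_to_tokens(token_ids)      # underlying sentencepiece pieces
print(pieces)
print(tokenizer.decode(token_ids, skip_special_tokens=True))  # back to (roughly) the input text
```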
@@ -63,35 +55,26 @@ print(generated_text)

  ## Training Dataset

- ### English
-
- - C4 - English
- - Project Gutenberg
- - RedPajama - Arxiv
- - RedPajama - CommonCrawl - English
- - RedPajama - Github
- - RedPajama - StackExchange
- - RedPajama - Wikipedia
-
- ### Japanese
+ <!-- - [Stanford Alpaca (Japanese translation)](https://huggingface.co/datasets/fujiki/japanese_alpaca_data) -->
+ - [databricks-dolly-15k (Japanese translation)](https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja)
+ - [Anthropic HH-RLHF (Japanese translation, subset)](https://huggingface.co/datasets/fujiki/japanese_hh-rlhf-49k)
+ - [OpenAssistant Conversations Dataset (Japanese translation, oasst1)](https://huggingface.co/datasets/kunishou/oasst1-89k-ja)
+ - [Wikinews subset of Izumi-lab llm-japanese-dataset](https://huggingface.co/datasets/izumi-lab/llm-japanese-dataset)

- - mC4 - Japanese
- - Wikipedia - Japanese
+ For the pre-trained base model, see [PLaMo-13B](https://huggingface.co/pfnet/plamo-13b).

- ## Tokenizer
- PLaMo-13B uses a sentencepiece tokenizer trained on a subset of the datasets used for model pre-training.

  ## Bias, Risks, and Limitations
- PLaMo-13B is a new technology that carries risks with use. Testing conducted to date has been in English and Japanese, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, PLaMo-13B’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of PLaMo-13B, developers should perform safety testing and tuning tailored to their specific applications of the model.
+ PLaMo-13B-Instruct is a new technology that carries risks with use. Testing conducted to date has been in English and Japanese, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, PLaMo-13B-Instruct’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of PLaMo-13B-Instruct, developers should perform safety testing and tuning tailored to their specific applications of the model.

  ## How to cite
  ```tex
- @online{PLaMo2023Introducing,
+ @online{PLaMoInstruct2023Introducing,
  author = {Preferred Networks, Inc},
- title = {PLaMo-13B},
+ title = {PLaMo-13B-Instruct},
  year = {2023},
- url = {https://huggingface.co/pfnet/plamo-13b},
- urldate = {2023-09-28}
+ url = {https://huggingface.co/pfnet/plamo-13b-instruct},
+ urldate = {2023-10-26}
  }
  ```