nebchi committed
Commit 97c3a95 • 1 Parent(s): 59deb3e

Update README.md

Files changed (1):
1. README.md +80 -0
README.md CHANGED
@@ -1,7 +1,87 @@
 
  ---
+ library_name: transformers
+ tags:
+ - pytorch
  license: llama3
+ language:
+ - ko
+ pipeline_tag: text-generation
  ---

+ <p align="left">
+ <img src="https://huggingface.co/cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0/resolve/main/ocelot.webp" width="50%"/>
+ </p>
+
+ # solar-kor-resume
+ > Update @ 2024.06.05: First release of Llama3-Ocelot-8B-instruct-v01
+
+ This model card corresponds to the 10.8B Instruct version of the **Llama-Ko** model.
+
+ Training was done on an A100 80GB GPU.
+
+ **Resources and Technical Documentation**:
+ * [llama Model](https://huggingface.co/beomi/Llama-3-Open-Ko-8B)
+ * [Orca-Math](https://huggingface.co/datasets/kuotient/orca-math-korean-dpo-pairs)
+ * [ko_Ultrafeedback_binarized](https://huggingface.co/datasets/maywell/ko_Ultrafeedback_binarized)
+
+ **Citation**: see the BibTeX entry at the end of this card.
+
+ **Model Developers**: frcp, nebchi, pepperonipizza97
+
+ ## Model Information
+ An LLM that generates Korean text, built by further training a pre-trained base model on a high-quality Korean SFT dataset and a DPO dataset.
+
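+ The DPO stage can be sketched with the `trl` library. The snippet below is an illustrative outline, not the authors' exact recipe: it assumes `trl` and `datasets` are installed, uses the base model and preference dataset linked above, and argument names vary across `trl` versions (older releases take `tokenizer=` instead of `processing_class=`).
+
+ ```python
+ # Illustrative DPO outline (assumptions noted above; not the original training code).
+ from datasets import load_dataset
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from trl import DPOConfig, DPOTrainer
+
+ base = "beomi/Llama-3-Open-Ko-8B"
+ model = AutoModelForCausalLM.from_pretrained(base)
+ tokenizer = AutoTokenizer.from_pretrained(base)
+
+ # Preference pairs; assumed to expose prompt / chosen / rejected columns.
+ dataset = load_dataset("maywell/ko_Ultrafeedback_binarized", split="train")
+
+ config = DPOConfig(output_dir="ocelot-dpo", beta=0.1, per_device_train_batch_size=1)
+ trainer = DPOTrainer(model=model, args=config, train_dataset=dataset, processing_class=tokenizer)
+ trainer.train()
+ ```
+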
+ #### *Inputs and outputs*
+
+ - **Input:** A text string, such as a question, a prompt, or a document to be summarized.
+ - **Output:** Generated Korean-language text in response to the input, such as an answer to a question or a summary of a document.
+
+ #### Running the model on a single / multi GPU
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer, pipeline
+
+ tokenizer = AutoTokenizer.from_pretrained("cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0")
+ model = AutoModelForCausalLM.from_pretrained("cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0", device_map="auto")
+
+ # Stream generated tokens to stdout as they are produced
+ streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+
+ pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=4096, streamer=streamer)
+
+ text = '대한민국의 수도는 어디인가요?'  # "What is the capital of South Korea?"
+
+ messages = [
+     {"role": "user", "content": text}
+ ]
+
+ # Render the chat template into a single prompt string
+ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ outputs = pipe(
+     prompt,
+     do_sample=True,  # sampling must be enabled for temperature to take effect
+     temperature=0.2,
+     add_special_tokens=True
+ )
+ print(outputs[0]["generated_text"][len(prompt):])
+ ```
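+ If the model does not fit in GPU memory, a 4-bit quantized load is one option. This is a sketch added for illustration (not from the original card), assuming `bitsandbytes` is installed:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # NF4 4-bit quantization cuts weight memory roughly 4x versus fp16.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ model = AutoModelForCausalLM.from_pretrained(
+     "cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0",
+     device_map="auto",
+     quantization_config=bnb_config,
+ )
+ ```
+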
+ ### Results
+
+ ```
+ 대한민국의 수도는 서울특별시입니다.
+ 서울특별시에는 청와대, 국회의사당, 대법원 등 대한민국의 주요 정부기관이 위치해 있습니다.
+ 또한 서울시는 대한민국의 경제, 문화, 교육, 교통의 중심지로서 대한민국의 수도이자 대표 도시입니다. 제가 도움이 되었길 바랍니다. 더 궁금한 점이 있으시면 언제든지 물어보세요!
+ ```
+ Translation: "The capital of South Korea is Seoul. Seoul is home to major government institutions such as the Blue House, the National Assembly, and the Supreme Court. It is also the center of the country's economy, culture, education, and transportation, making it both the capital and the representative city of South Korea. I hope this was helpful; feel free to ask if you have any more questions!"
+
+ ```bibtex
+ @misc{Ocelot-Ko-self-instruction-10.8B-v1.0,
+   author    = {frcp and nebchi and pepperonipizza97},
+   title     = {solar-kor-resume},
+   year      = {2024},
+   url       = {https://huggingface.co/cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0},
+   publisher = {Hugging Face}
+ }
+ ```

  Results on [LogicKor](https://github.com/StableFluffy/LogicKor) are as follows: