sjbaek committed on
Commit
44ca3e1
•
1 Parent(s): 1224bc0

Update README.md

Files changed (1)
  1. README.md +54 -1
README.md CHANGED
@@ -29,7 +29,60 @@ Gemma2 2b 한국어 방언 통역기는 한국어 사투리를 표준어로 번
 
 ## How to Get Started with the Model
 
-Use the code below to get started with the model.
+```
+import transformers
+import torch
+
+model_id = "sjbaek/gemma2-2b-it-korean-dialect"
+tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, add_eos_token=True)
+
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    tokenizer=tokenizer,
+    torch_dtype=torch.float16,
+    device_map="auto",
+    max_new_tokens = 512,
+)
+
+
+def dialect_to_standard(text, dialect_type):
+    return [
+        {
+            "role":"user",
+            "content": "Convert the following sentence or word which is {}'s dialect to standard Korean:\n\n{}".format(dialect_type, text)
+        }
+    ]
+
+
+def standard_to_dialect(text, dialect_type):
+    return [
+        {
+            "role":"user",
+            "content": "Convert the following sentence or word which is standard Korean to {}'s dialect :\n\n{}".format(dialect_type, text)
+        }
+    ]
+
+outputs = pipeline(
+    dialect_to_standard("우리 동생도 요번에 월요일날 미깡 타카부댄 내려왔당 못 타난", "제주도"),
+    do_sample=True,
+    temperature=0.1,
+    top_p=0.90,
+    add_special_tokens=True
+)
+
+print(outputs[0]["generated_text"][-1])
+
+outputs = pipeline(
+    standard_to_dialect("그러니까 저 어머니 더 나이 먹어가기 전에 여기 와야 될 건데", "제주도"),
+    do_sample=True,
+    temperature=0.1,
+    top_p=0.90,
+    add_special_tokens=True
+)
+
+print(outputs[0]["generated_text"][-1])
+```
 
 ### Training Data
 
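Note on the added snippet: its final `print` emits the last chat message as a whole dict (role plus content). As a minimal sketch, assuming the chat-style result shape that the snippet itself indexes into (`outputs[0]["generated_text"]` is a list of role/content dicts ending with the model's reply), the reply text alone could be pulled out as shown below. `build_prompt` and `extract_reply` are hypothetical helper names and not part of this commit, and the pipeline result is mocked so the block runs without downloading the model.

```python
from typing import Dict, List

Message = Dict[str, str]

# Prompt wording copied from the README snippet (including the space before the colon).
TO_STANDARD = "Convert the following sentence or word which is {}'s dialect to standard Korean:\n\n{}"
TO_DIALECT = "Convert the following sentence or word which is standard Korean to {}'s dialect :\n\n{}"


def build_prompt(text: str, dialect_type: str, to_standard: bool = True) -> List[Message]:
    """Hypothetical helper: build a single-turn chat prompt for either direction."""
    template = TO_STANDARD if to_standard else TO_DIALECT
    return [{"role": "user", "content": template.format(dialect_type, text)}]


def extract_reply(outputs: List[dict]) -> str:
    """Hypothetical helper: return only the text of the last chat message.

    Assumes outputs[0]["generated_text"] is a list of {"role", "content"} dicts.
    """
    return outputs[0]["generated_text"][-1]["content"]


# Mocked stand-in for a real pipeline result (assumption: same shape as above).
prompt = build_prompt("example sentence", "제주도")  # 제주도 = Jeju Island
fake_outputs = [
    {"generated_text": prompt + [{"role": "assistant", "content": "placeholder reply"}]}
]
print(extract_reply(fake_outputs))
```

With the real pipeline, `extract_reply(outputs)` would take the place of `outputs[0]["generated_text"][-1]` in the `print` calls, provided the result has the shape assumed here.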