sjbaek/gemma2-2b-it-korean-dialect


sjbaek committed on Sep 22

Commit 44ca3e1 • 1 Parent(s): 1224bc0

Update README.md

Files changed (1)

README.md +54 -1


@@ -29,7 +29,60 @@ The Gemma2 2b Korean dialect interpreter: Korean dialect into standard Korean [truncated]
 ## How to Get Started with the Model
-Use the code below to get started with the model.
 ### Training Data

 ## How to Get Started with the Model
+```python
+import transformers
+import torch
+model_id = "sjbaek/gemma2-2b-it-korean-dialect"
+tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, add_eos_token=True)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    tokenizer=tokenizer,
+    torch_dtype=torch.float16,
+    device_map="auto",
+    max_new_tokens=512,
+)
+def dialect_to_standard(text, dialect_type):
+    return [
+        {
+            "role": "user",
+            "content": "Convert the following sentence or word which is {}'s dialect to standard Korean:\n\n{}".format(dialect_type, text)
+        }
+    ]
+def standard_to_dialect(text, dialect_type):
+    return [
+        {
+            "role": "user",
+            "content": "Convert the following sentence or word which is standard Korean to {}'s dialect :\n\n{}".format(dialect_type, text)
+        }
+    ]
+outputs = pipeline(
+    dialect_to_standard("우리 동생도 요번에 월요일날 미깡 타카부댄 내려왔당 못 타난", "제주도"),
+    do_sample=True,
+    temperature=0.1,
+    top_p=0.90,
+    add_special_tokens=True
+)
+print(outputs[0]["generated_text"][-1])
+outputs = pipeline(
+    standard_to_dialect("그러니깐 저 어머니 더 나이 먹어가기 전에 여기 와야 될 건데", "제주도"),
+    do_sample=True,
+    temperature=0.1,
+    top_p=0.90,
+    add_special_tokens=True
+)
+print(outputs[0]["generated_text"][-1])
+```
 ### Training Data
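
The added README snippet prints the whole last chat message (a dict with `role` and `content`) rather than just the converted sentence. The sketch below shows one way to build the prompt and pull out only the reply text. It assumes the pipeline returns the usual chat-style structure, a list whose first item holds the message list under `generated_text`. The helper names `build_messages` and `extract_reply` and the stand-in `fake_outputs` are hypothetical and not part of the commit; the prompt wording is copied from the diff.

```python
# Hypothetical helpers for illustration; they are not part of the committed README.

def build_messages(text, dialect_type, to_standard=True):
    """Build a single-turn chat prompt using the wording from the README diff."""
    if to_standard:
        template = "Convert the following sentence or word which is {}'s dialect to standard Korean:\n\n{}"
    else:
        template = "Convert the following sentence or word which is standard Korean to {}'s dialect :\n\n{}"
    return [{"role": "user", "content": template.format(dialect_type, text)}]


def extract_reply(outputs):
    """Return only the text of the last message in a chat-style pipeline result."""
    last_message = outputs[0]["generated_text"][-1]
    return last_message["content"]


# Stand-in for a real pipeline result, so the sketch runs without loading the model.
messages = build_messages("<input sentence>", "제주도")
fake_outputs = [
    {"generated_text": messages + [{"role": "assistant", "content": "<model reply>"}]}
]
print(extract_reply(fake_outputs))
```

With the real model, `fake_outputs` would be replaced by the value returned from `pipeline(build_messages(...), ...)` as in the README.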