boapps committed on
Commit
5eac576
1 Parent(s): bb9207a

Update README.md

Files changed (1): README.md +78 -0
README.md CHANGED
---
license: apache-2.0
datasets:
- boapps/alpaca-hu
- mlabonne/alpagasus
language:
- hu
library_name: transformers
pipeline_tag: text-generation
---

# szürkemarha-mistral v1

This is the first (test) version of a Hungarian instruction-following model.

## Usage

This repo contains an `app.py` script that provides a Gradio interface for more convenient use; a rough sketch of such an app follows.
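
As an illustration only, here is a minimal sketch of what a Gradio wrapper like that might look like; the actual `app.py` in the repo may differ, and `generate_answer` is a hypothetical stub standing in for the generation code shown further below:

```python
# Minimal Gradio wrapper sketch -- NOT the repo's actual app.py.
# generate_answer() is a hypothetical stub; a real app would run the
# model.generate() call from the code example below.
import gradio as gr

def generate_answer(instruction: str, context: str) -> str:
    # Stub for illustration: echoes the inputs instead of calling the model.
    return f"(model response for instruction={instruction!r}, input={context!r})"

demo = gr.Interface(
    fn=generate_answer,
    inputs=[gr.Textbox(label="Instruction"), gr.Textbox(label="Input")],
    outputs=gr.Textbox(label="Response"),
    title="szürkemarha-mistral v1",
)
demo.launch()
```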

Or, from code, something like this:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig

BASE_MODEL = "mistralai/Mistral-7B-v0.1"
LORA_WEIGHTS = "boapps/szurkemarha-mistral"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Prefer CUDA; fall back to Apple's MPS backend when available.
device = "cuda"
try:
    if torch.backends.mps.is_available():
        device = "mps"
except Exception:
    pass

# 4-bit NF4 quantization so the 7B base model fits in modest VRAM.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, quantization_config=nf4_config)

# Apply the LoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)

# Alpaca-style prompt. The Hungarian instruction asks: "Which county is the
# city below located in?" with the input "Pécs".
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Melyik megyében található az alábbi város?

### Input:
Pécs

### Response:"""

inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )

s = generation_output.sequences[0]
output = tokenizer.decode(s)
print(output.split("### Response:")[1].strip())
```
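
To ask the model something else, the prompt has to follow the same Alpaca-style template. A small helper along these lines can keep that formatting in one place (a sketch; `make_prompt` is not part of the original README):

```python
# Hypothetical helper, not from the original README: builds the same
# Alpaca-style prompt used above for an arbitrary instruction/input pair.
def make_prompt(instruction: str, input_text: str) -> str:
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_text}\n\n"
        "### Response:"
    )

prompt = make_prompt("Melyik megyében található az alábbi város?", "Pécs")
```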