|
---
license: apache-2.0
datasets:
- boapps/alpaca-hu
- mlabonne/alpagasus
language:
- hu
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
# szürkemarha-mistral v1 |
|
|
|
This is the first (test) version of a Hungarian-language instruction-following model.
|
|
|
<img src="szurkemarha_logo.png" width="400"> |
|
|
|
## Usage
|
|
|
This repository contains an `app.py` script that provides a Gradio interface for more convenient use.
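The bundled `app.py` is not reproduced here; purely as an illustration, a minimal Gradio wrapper along those lines might look like the sketch below, where `generate_response` is a hypothetical helper standing in for the generation code shown in the next section:

```python
import gradio as gr

def generate_response(instruction: str, input_text: str = "") -> str:
    # Placeholder: build the Alpaca-style prompt and call model.generate()
    # exactly as in the code example below, then return the decoded answer.
    return "TODO: wire up the model here"

demo = gr.Interface(
    fn=generate_response,
    inputs=[gr.Textbox(label="Instruction"), gr.Textbox(label="Input (optional)")],
    outputs=gr.Textbox(label="Response"),
    title="szürkemarha-mistral",
)
demo.launch()
```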
|
|
|
Alternatively, from code, roughly like this:
|
|
|
```python |
|
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig

# Base model and LoRA adapter identifiers
BASE_MODEL = "mistralai/Mistral-7B-v0.1"
LORA_WEIGHTS = "boapps/szurkemarha-mistral"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Pick the best available device (fall back to MPS on Apple Silicon)
device = "cuda" if torch.cuda.is_available() else "cpu"
try:
    if torch.backends.mps.is_available():
        device = "mps"
except AttributeError:
    # Older torch builds do not expose the MPS backend
    pass

# 4-bit NF4 quantization so the 7B base model fits into consumer GPU memory
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the LoRA adapter on top
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, quantization_config=nf4_config, device_map="auto")
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16)

# Alpaca-style prompt the model was fine-tuned on
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Melyik megyében található az alábbi város?

### Input:
Pécs

### Response:"""

inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)

# Conservative decoding: low temperature combined with beam search
generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )

# Decode and print only the part generated after the response marker
s = generation_output.sequences[0]
output = tokenizer.decode(s, skip_special_tokens=True)
print(output.split("### Response:")[1].strip())
|
``` |
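
The prompt above follows the Alpaca template used by the training data. As a convenience, a small helper for assembling such prompts could look like the sketch below; `build_prompt` is not part of the repository, and the shorter no-input variant follows the common Alpaca convention rather than anything confirmed for this checkpoint:

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Alpaca-style prompt from an instruction and an optional input."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input that provides further context. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:"
    )
```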