---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
datasets:
- afrizalha/Gatra-2-Javanese
---
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Document Title</title>
<style>
h1 {
font-size: 36px;
color: navy;
font-family: 'Tahoma';
text-align: center;
}
</style>
</head>
<body>
<h1> Open models for indigenous Indonesian languages</h1>
</body>
</html>
<center>
<img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250">
<p><em>Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction performance using solely synthetic data.</em></p>
<p><em style="color: black; font-weight: bold;">Beta preview</em></p>
</center>
Bakpia V1 is a family of Javanese language models. The models are fine-tuned from existing open models on a large synthetic dataset of Krama Javanese, in which the prompts were generated by GPT-4o and the responses by Claude 3 Haiku.
This repository contains the fp16 version of Bakpia V1 1.5B.
| Version | Base Model | URL | Training |
|---------|------------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) | Batch = 16\*8, lr = 4e-5, linear schedule|
Training data is accessible [here](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).
## Version 1.0
This is the first version of Bakpia.
✨ Training
- 36K input-output pairs
- LoRA r/alpha = 64/128
- Rank-stabilized LoRA
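To make the training settings more concrete, the sketch below works through the arithmetic. It assumes that `16*8` in the table means a per-device batch of 16 with 8 gradient-accumulation steps (the card does not say which factor is which), and it uses the published rank-stabilized LoRA scaling of alpha / sqrt(r) in place of the standard alpha / r. The variable names are illustrative and are not taken from the training script.

```python
import math

# Values taken from this card
num_pairs = 36_000            # "36K input-output pairs" (approximate)
lora_r, lora_alpha = 64, 128
epochs = 1

# Assumption: "16*8" = per-device batch size 16 x 8 gradient-accumulation steps
per_device_batch, grad_accum = 16, 8
effective_batch = per_device_batch * grad_accum

# Optimizer steps for one pass over the data (last partial batch counted)
steps = math.ceil(num_pairs * epochs / effective_batch)

# LoRA update scaling: standard LoRA uses alpha / r,
# rank-stabilized LoRA (rsLoRA) uses alpha / sqrt(r)
standard_scale = lora_alpha / lora_r
rslora_scale = lora_alpha / math.sqrt(lora_r)

print(f"effective batch size: {effective_batch}")
print(f"approx. optimizer steps: {steps}")
print(f"standard LoRA scale: {standard_scale}")
print(f"rsLoRA scale: {rslora_scale}")
```

With these values the rank-stabilized factor is 16 rather than 2, so the adapter update is scaled more strongly than it would be under standard LoRA at the same rank.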
✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.
## Generate with template
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

model_id = "afrizalha/Bakpia-V1-1.5B-Javanese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.to("cuda")  # requires a CUDA-capable GPU

# ChatML-style prompt with an empty system message
template = """<|im_start|>system
<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

# Use names that do not shadow the built-in `input`
prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
# Sample up to 1024 new tokens and stream them to stdout as they are generated
outputs = model.generate(
    **inputs, max_new_tokens=1024, streamer=TextStreamer(tokenizer),
    temperature=0.5, do_sample=True, use_cache=True,
)
```
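The prompt template above can be wrapped in a small helper so that the same format is reused for every query. This is a minimal sketch: `build_prompt` is a hypothetical helper name, not part of `transformers` or of this repository, and it simply reproduces the template string with an empty system message by default.

```python
def build_prompt(user_message: str, system_message: str = "") -> str:
    """Format a single-turn query in the ChatML-style layout used above."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_prompt("Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
print(prompt)
```

The returned string can be passed to the tokenizer in place of `template.format(...)`. Since the model supports single-turn QA only, the helper takes one user message rather than a conversation history.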
## Acknowledgments
- **Developed by:** Afrizal Hasbi Azizy
- **License:** Apache-2.0