---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
datasets:
- afrizalha/Gatra-2-Javanese
---
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document Title</title>
    <style>
        h1 {
            font-size: 36px;
            color: navy;
            font-family: 'Tahoma';
            text-align: center;
        }
    </style>
</head>
<body>
    <h1> Open models for indigenous Indonesian languages</h1>
</body>
</html>

<center>
    <img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250">
    <p><em>Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.</em></p>
    <p><em style="color: black; font-weight: bold;">Beta preview</em></p>
</center>
Bakpia V1 is a family of Javanese language models. Each model is fine-tuned from an existing open model on a large synthetic dataset in Krama Javanese, in which the prompts were generated by GPT-4o and the responses by Claude 3 Haiku.

This repository contains the fp16 version of Bakpia V1 1.5B.

| Version | Base Model | URL | Training |
|---------|------------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) | Batch = 16\*8, lr = 4e-5, linear schedule|

Training data is accessible [here](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).
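
The training column above can be unpacked with a little arithmetic. The sketch below is illustrative only and rests on assumptions the card does not state: that `16*8` means a per-device batch of 16 with 8 gradient-accumulation steps, that the dataset holds exactly 36,000 pairs, and that the linear schedule decays to zero with no warmup. The function names are hypothetical.

```python
import math


def effective_batch_size(per_device: int, accumulation: int) -> int:
    """Examples consumed per optimizer step (assumed reading of 16*8)."""
    return per_device * accumulation


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def linear_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Linear decay from peak_lr at step 0 to zero at total_steps (no warmup assumed)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return peak_lr * remaining


batch = effective_batch_size(16, 8)     # 128
steps = steps_per_epoch(36_000, batch)  # 282 (assumes exactly 36,000 pairs)

for step in (0, steps // 2, steps):
    print(step, linear_lr(step, steps, peak_lr=5e-5))
```

Under these assumptions a single epoch of the 0.5B or 1.5B run is a few hundred optimizer steps, with the learning rate halving by the midpoint.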

## Version 1.0

This is the first version of Bakpia.

✨ Training
- 36K input-output pairs
- LoRA r/alpha = 64/128
- Rank-stabilized LoRA
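
To show what rank stabilization changes for the 64/128 r/alpha setting listed above, here is a minimal sketch comparing the two scaling factors. It assumes the usual definitions (standard LoRA scales the adapter update by `alpha / r`, rank-stabilized LoRA by `alpha / sqrt(r)`); the helper names are hypothetical and this is not the author's training code.

```python
import math


def lora_scale(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update."""
    return alpha / r


def rslora_scale(alpha: float, r: int) -> float:
    """Rank-stabilized LoRA scaling: divide by sqrt(r) instead of r."""
    return alpha / math.sqrt(r)


R, ALPHA = 64, 128  # values from the training list above

print(lora_scale(ALPHA, R))    # 2.0
print(rslora_scale(ALPHA, R))  # 16.0
```

With a rank as high as 64, the rank-stabilized factor is eight times larger than the standard one, so the adapter update is not shrunk as aggressively as the rank grows.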

✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.

## Generate with template
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

# Load the tokenizer and model, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model.to("cuda")

# ChatML-style template with an empty system turn.
template = """<|im_start|>system
<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

# Use names that do not shadow the built-in `input`.
text = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([text], return_tensors="pt").to("cuda")

# Sample a response and stream tokens to stdout as they are generated.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    streamer=TextStreamer(tokenizer),
    temperature=0.5,
    do_sample=True,
    use_cache=True,
)
```
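
When generating for many prompts, the template string can be wrapped in a small helper. The sketch below is one possible way to do this; `CHATML_TEMPLATE` and `build_prompt` are hypothetical names, and the optional `system` argument is an assumption (the example above always leaves the system turn empty, and the card does not say whether the model was trained with system prompts).

```python
# Same layout as the template above, with the system turn made a parameter.
CHATML_TEMPLATE = (
    "<|im_start|>system\n"
    "{system}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(prompt: str, system: str = "") -> str:
    """Return a single-turn prompt string ending at the assistant turn."""
    return CHATML_TEMPLATE.format(system=system, prompt=prompt)


print(build_prompt("Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?"))
```

With the default empty `system`, the output is identical to the template used in the generation example, so the result can be passed to the tokenizer in the same way.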

## Acknowledgments

- **Developed by:** Afrizal Hasbi Azizy
- **License:** Apache-2.0