Open models for indigenous Indonesian languages

Bakpia

Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to achieve functional instruction-following performance using solely synthetic data.

Beta preview

Bakpia V1 is a family of Javanese language models fine-tuned from available open models on a large synthetic dataset of Krama Javanese, with prompts generated by GPT-4o and responses generated by Claude 3 Haiku.
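
For illustration only, a minimal sketch of such a synthetic-data pipeline (the model names match the card; the client calls, prompts, and variable names are assumptions, not the actual generation code):

```python
# Hypothetical sketch of the synthetic data pipeline described above.
# Assumes the openai and anthropic Python clients; prompts are illustrative only.
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()
anthropic_client = Anthropic()

# 1) GPT-4o drafts a Krama Javanese user prompt (instruction wording is an assumption).
prompt = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write one question in Krama Javanese about daily life."}],
).choices[0].message.content

# 2) Claude 3 Haiku writes the reference response.
response = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
).content[0].text

pair = {"input": prompt, "output": response}  # one of the ~36K training pairs
```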

This repository contains the 4bit version of Bakpia V1 9B.

| Version | Base Model | URL | Training |
|---------|-----------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | fp16 | Epoch = 1, batch = 16*8, lr = 5e-5, linear schedule |
| V1 1.5B | Qwen 2 1.5B Instruct | fp16 | Epoch = 1, batch = 16*8, lr = 5e-5, linear schedule |
| V1 9B | Gemma 2 9B Instruct | fp16/4bit | Batch = 16*8, lr = 4e-5, linear schedule |
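
As a rough illustration, the 9B row above could map onto transformers TrainingArguments as follows (a hedged sketch: reading the 16*8 batch as per-device batch 16 with 8 gradient-accumulation steps, plus an assumed output_dir, is our interpretation, not the card's):

```python
# Hypothetical mapping of the table's 9B hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bakpia-v1-9b",        # assumed name
    per_device_train_batch_size=16,   # "16*8" read as batch 16 x 8 accumulation steps (assumption)
    gradient_accumulation_steps=8,
    learning_rate=4e-5,               # per the table
    lr_scheduler_type="linear",       # linear schedule per the table
    num_train_epochs=1,
)
```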

Training data is accessible here.

Version 1.0

This is the first version of Bakpia.

✨ Training

  • 36K input-output pairs
  • LoRA r = 64, alpha = 128
  • Rank-stabilized LoRA (see the configuration sketch below)
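
A minimal reconstruction of that adapter configuration with peft (the r/alpha values and the rank-stabilized flag come from the list above; the target modules and everything else are assumptions):

```python
# Hypothetical peft configuration matching the card's LoRA settings.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                  # LoRA rank, per the card
    lora_alpha=128,        # LoRA alpha, per the card
    use_rslora=True,       # rank-stabilized LoRA: scales by alpha / sqrt(r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```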

✨ Features

  • Single-turn QA across various domains.
  • Ngoko Javanese is not currently supported.

Generate with template

```python
# Update transformers for Gemma 2 compatibility + install accelerate and bitsandbytes for loading the 4bit model
!pip install -q git+https://github.com/huggingface/transformers.git
!pip install -q accelerate bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-4bit")
# A 4-bit quantized model cannot be moved with .to(); let accelerate place it via device_map
model = AutoModelForCausalLM.from_pretrained(
    "afrizalha/Bakpia-V1-9B-Javanese-4bit",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# Gemma 2 chat format
template = """<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
"""

prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    streamer=TextStreamer(tokenizer),  # stream tokens to stdout as they are generated
    temperature=0.5,
    use_cache=True,
    do_sample=True,
)
```
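
Alternatively, the same prompt can be built with the tokenizer's bundled chat template rather than a hand-written string (a small sketch; assumes the tokenizer ships Gemma 2's chat template):

```python
# Build the Gemma 2 prompt via the tokenizer's chat template instead of a manual string.
messages = [{"role": "user", "content": "Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends "<start_of_turn>model\n"
    return_tensors="pt",
).to("cuda")
outputs = model.generate(input_ids, max_new_tokens=1024, temperature=0.5, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```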

Acknowledgments

  • Developed by: Afrizal Hasbi Azizy
  • License: Apache-2.0
