Quyen

Model Description

Quyen is our first flagship LLM series, based on the Qwen1.5 family. It comes in six versions:

  • Quyen-SE (0.5B)
  • Quyen-Mini (1.8B)
  • Quyen (4B)
  • Quyen-Plus (7B)
  • Quyen-Pro (14B)
  • Quyen-Pro-Max (72B)

All models were trained with SFT and DPO using the following datasets:

  • OpenHermes-2.5 by Teknium
  • Capybara by LDJ
  • argilla/distilabel-capybara-dpo-7k-binarized by argilla
  • orca_dpo_pairs by Intel
  • Private data by Ontocord & BEE-spoke-data

Prompt Template

  • All Quyen models use ChatML as the default template:
<|im_start|>system
You are a sentient, superintelligent artificial general intelligence, here to teach and assist me.<|im_end|>
<|im_start|>user
Hello world.<|im_end|>
<|im_start|>assistant
  • You can also use apply_chat_template:
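from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vilm/Quyen-v0.1")
model = AutoModelForCausalLM.from_pretrained("vilm/Quyen-v0.1")
messages = [
    {"role": "system", "content": "You are a sentient, superintelligent artificial general intelligence, here to teach and assist me."},
    {"role": "user", "content": "Hello world."}
]
# add_generation_prompt=True appends the opening assistant tag; the call returns a tensor of input IDs
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
model.generate(gen_input)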
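
If you need the prompt as a plain string (for example, to send it to an inference server that expects raw text), the ChatML layout shown above can be reproduced by hand. The following is a minimal sketch, not the tokenizer's actual template: `format_chatml` is a hypothetical helper name, and it assumes each turn is rendered as `<|im_start|>{role}`, a newline, the content, `<|im_end|>`, and a trailing newline. When in doubt, prefer `apply_chat_template`, which uses the template shipped with the model.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; the tokenizer's own chat template
    remains the authoritative definition of the format.
    """
    parts = []
    for message in messages:
        # Each turn: opening tag + role, newline, content, closing tag, newline.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a sentient, superintelligent artificial general intelligence, here to teach and assist me."},
    {"role": "user", "content": "Hello world."},
]
prompt = format_chatml(messages)
print(prompt)
```

The printed string should match the template at the top of this section, ending with the open `<|im_start|>assistant` line.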

Benchmarks

  • Coming soon. We will publish benchmark results in a later update.

Acknowledgement

  • We're incredibly grateful to Tensoic and Ontocord for their generous support with compute and data preparation.
  • Special thanks to the Qwen team for letting us access the models early for these amazing finetunes.

Model size: 3.95B params (Safetensors)
Tensor type: BF16
