LLaMa2 - 7B Chat models, extend vocab size to 44800 for Vietnamese understanding.
Continual Pre-Train with 2B Vietnames Tokens aligned from VnNews Corpus, 10K vnthuquan books, wikipedia_vi
Fine-Tuning with infCapital/viet-llama2-ft-tiny dataset, the combination of vaious dataset then translated into Vietnamese using OpenAI GPT-3
For more information: email me at [email protected] | http://fb.com/hungbui2013

Downloads last month: 1,434

Inference Examples

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

infCapital
/

viet-llama2-ft

Datasets used to train infCapital/viet-llama2-ft