license: apache-2.0
datasets:
  - vietgpt/wikipedia_vi
  - oscar-corpus/OSCAR-2301
language:
  - vi
  - en
pipeline_tag: text-generation

Concept of open-llama-7b-vi

This is an OpenLLaMA model fine-tuned on Vietnamese-language text.

Model architecture

The model architecture is identical to that of the original OpenLLaMA model.
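Because the architecture is unchanged, the checkpoint should load with the standard LLaMA classes from the `transformers` library. The following is a minimal sketch under stated assumptions, not an official usage recipe: the repository id `nhanv/open-llama-7b-vi` is a guess based on the model name, and the helper names `load_model` and `generate` are hypothetical. The slow `LlamaTokenizer` is used because the upstream OpenLLaMA project advises against the auto-converted fast tokenizer.

```python
# Assumed repository id, inferred from the model name; replace with the real one.
MODEL_ID = "nhanv/open-llama-7b-vi"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and weights with the stock LLaMA classes."""
    # Imported lazily so this file can be read and imported without
    # having transformers installed.
    from transformers import LlamaForCausalLM, LlamaTokenizer

    tokenizer = LlamaTokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def generate(prompt: str, tokenizer, model, max_new_tokens: int = 64) -> str:
    """Continue `prompt` and return the decoded text (prompt included)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs.input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the full 7B checkpoint, so it is left commented out):
# tokenizer, model = load_model()
# print(generate("Hà Nội là", tokenizer, model))
```

Loading a 7B model in full precision needs a large amount of memory; passing a half-precision `torch_dtype` to `from_pretrained` is a common way to reduce it.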

Training Data

The model was trained on the Vietnamese version of Wikipedia. The generated corpus files total 1.5 GB and contain approximately 1.3M sentences.
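Corpus figures like the ones above can be checked with a few lines of Python. This is only a sketch: it assumes the corpus is stored as UTF-8 text files with one sentence per line, which this README does not state, and `corpus_stats` is a hypothetical helper name.

```python
from pathlib import Path


def corpus_stats(paths):
    """Return (total size in bytes, number of non-empty lines) for the given files.

    Assumes one sentence per line; blank lines are not counted.
    """
    total_bytes = 0
    total_sentences = 0
    for path in map(Path, paths):
        total_bytes += path.stat().st_size
        with path.open("r", encoding="utf-8") as handle:
            total_sentences += sum(1 for line in handle if line.strip())
    return total_bytes, total_sentences
```

For example, `corpus_stats(Path("corpus").glob("*.txt"))` would report totals for every text file in a hypothetical `corpus` directory.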