LLaMA Chinese 81M

A small pretrained language model for Chinese and English (bilingual).

Training Dataset

Chinese Wikipedia (20230601)
English Wikipedia (20230601)

Tokenizer

Uses a BPE tokenizer retrained on Chinese and English corpora, which gives better word segmentation and better encoding and decoding efficiency.

https://github.com/p208p2002/BPE-tokenizer-from-zh-wiki
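
To try the tokenizer and model together, the following is a minimal sketch rather than an official usage recipe. It assumes the checkpoint published as p208p2002/llama-chinese-81M loads through the generic `AutoTokenizer` and `AutoModelForCausalLM` classes of the `transformers` library, and that `transformers` and `torch` are installed. The helper name `generate` and its default of 32 new tokens are illustrative choices, not part of this model card.

```python
MODEL_ID = "p208p2002/llama-chinese-81M"  # repository id of this model


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Continue `prompt` with the model (hypothetical helper).

    The checkpoint is downloaded from the Hugging Face Hub on first call.
    """
    # Imported inside the function so that defining the helper does not
    # require the libraries (assumption: transformers and torch are installed).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships tokenizer files that AutoTokenizer
    # can resolve, and weights that AutoModelForCausalLM can load.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Encode the prompt to token ids and let the model extend the sequence.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)

    # Decode prompt plus continuation back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("台灣是")` or `generate("Taiwan is")` would exercise the Chinese and English sides of the tokenizer respectively. The helper reloads the model on every call, which keeps the sketch short but is wasteful for repeated use.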

Safetensors

Model size: 81M params

Tensor type: F32
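
As a rough guide to what these figures mean in practice, the size of the weights can be estimated as parameter count times bytes per parameter. The sketch below uses the rounded figure of 81M parameters from this card. The F16 and BF16 entries are included only for comparison and are not formats listed for this model, and the function name `weight_size_mb` is a hypothetical helper.

```python
PARAM_COUNT = 81_000_000  # "81M params" from this card (rounded figure)

# Bytes used to store one parameter in each tensor type.
# Only F32 is listed for this model; the others are for comparison.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_size_mb(param_count: int, tensor_type: str = "F32") -> float:
    """Approximate size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return param_count * BYTES_PER_PARAM[tensor_type] / 1e6


print(f"F32 weights: about {weight_size_mb(PARAM_COUNT):.0f} MB")
```

The estimate covers the stored weights only. Memory used at inference time is higher because of activations and framework overhead.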
