HenryHHHH committed on
Commit
e7e8fb7
1 Parent(s): 3d56cd2

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -25,7 +25,9 @@ base_model: meta-llama/LLaMA-2-7B
 
 ### Overview
 
-This model is a distilled version of LLaMA 2, containing approximately 80 million parameters. It was trained using a mix of OpenWebText and WikiText Raw V1 datasets. Knowledge distillation was employed to transfer knowledge from a larger "teacher" model—Meta’s 7B LLaMA 2—to help this smaller model mimic the behavior of the teacher.
+This model is a distilled version of LLaMA 2, containing approximately 80 million parameters.
+It was trained using a mix of OpenWebText and WikiText Raw V1 datasets.
+Knowledge distillation was employed to transfer knowledge from a larger "teacher" model—Meta’s 7B LLaMA 2—to help this smaller model mimic the behavior of the teacher.
 This version is the latest version of DistilLlama, which has gone through 5 days of training using two Nvidia A100 80G GPUs.
 
 ### Model Architecture
@@ -98,4 +100,4 @@ The architecture is based on LLaMA 2, with the following parameters:
   url={https://arxiv.org/abs/2308.02019},
 }
 
-*Note: The repository will be updated as training progresses. Last update 2024-10-23*
+*Note: The repository will be updated as training progresses. Last update 2024-11-01*
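The overview in this diff describes knowledge distillation from a 7B teacher to an ~80M student. As a rough illustration only (the actual DistilLlama training code is not part of this commit), the standard soft-target objective it alludes to can be sketched in NumPy: a temperature-softened KL term against the teacher's logits, blended with ordinary cross-entropy on the hard labels. The function name, temperature `T`, and mixing weight `alpha` below are illustrative assumptions, not values taken from the repository.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hypothetical distillation objective: alpha * KL(teacher || student)
    at temperature T (scaled by T^2), plus (1 - alpha) * hard-label CE."""
    eps = 1e-12
    p_teacher = softmax(teacher_logits, T)                  # softened teacher targets
    log_p_student = np.log(softmax(student_logits, T) + eps)
    kl = np.mean(np.sum(p_teacher * (np.log(p_teacher + eps) - log_p_student),
                        axis=-1)) * T * T
    log_probs = np.log(softmax(student_logits) + eps)       # hard-label cross-entropy
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])
    return alpha * kl + (1 - alpha) * ce

# Toy usage: a batch of 4 examples over a 10-token vocabulary.
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 10))
teacher = rng.normal(size=(4, 10))
labels = np.array([0, 1, 2, 3])
loss = distillation_loss(student, teacher, labels)
```

When the student's logits exactly match the teacher's, the KL term vanishes and only the cross-entropy component remains, which is a quick sanity check on an implementation like this.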