OpenNLPLab commited on
Commit
9600986
β€’
1 Parent(s): 60b0f0a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +17 -3
README.md CHANGED
@@ -58,11 +58,19 @@ This official repository unveils the TransNormerLLM3 model along with its open-s
58
  | **15B** | 550B | πŸ€—[step144000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step144000-550Btokens) | πŸ€– | 🐯 |
59
  | **15B** | 600B | πŸ€—[step157000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step157000-600Btokens) | πŸ€– | 🐯 |
60
  | **15B** | 650B | πŸ€—[step170000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step170000-650Btokens) | πŸ€– | 🐯 |
 
 
 
 
 
 
 
 
61
  ```python
62
  from transformers import AutoModelForCausalLM, AutoTokenizer
63
 
64
- tokenizer = AutoTokenizer.from_pretrained("OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints", revision='step170000-650Btokens', trust_remote_code=True)
65
- model = AutoModelForCausalLM.from_pretrained("OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints", torch_dtype=torch.bfloat16, revision='step170000-650Btokens', device_map="auto", trust_remote_code=True)
66
  ```
67
 
68
  # Benchmark Results
@@ -77,12 +85,18 @@ The evaluations of all models are conducted using the official settings and the
77
  | **TransNormerLLM3-15B** | 15 | 0.25 | 66.70 | 76.50 | 66.51 | 64.80 | 66.84 | 36.18 | 39.40 | 30.87 | 36.10 |
78
  | **TransNormerLLM3-15B** | 15 | 0.30 | 67.00 | 76.50 | 67.17 | 64.40 | 66.29 | 36.77 | 38.80 | 33.99 | 37.60 |
79
  | **TransNormerLLM3-15B** | 15 | 0.35 | 65.78 | 75.46 | 67.88 | 66.54 | 67.34 | 38.57 | 39.60 | 36.02 | 39.20 |
80
- | **TransNormerLLM3-15B** | 15 | 0.40 | 67.34 | 75.24 | 68.51 | 66.22 | 68.94 | 40.10 | 39.20 | 41.10 | 39.01 |
81
  | **TransNormerLLM3-15B** | 15 | 0.45 | 69.02 | 76.28 | 69.11 | 63.77 | 65.82 | 36.01 | 39.40 | 37.17 | 42.80 |
82
  | **TransNormerLLM3-15B** | 15 | 0.50 | 66.15 | 77.09 | 69.75 | 65.11 | 68.56 | 35.84 | 39.60 | 39.81 | 42.00 |
83
  | **TransNormerLLM3-15B** | 15 | 0.55 | 70.24 | 74.05 | 69.96 | 65.75 | 65.61 | 36.69 | 38.60 | 40.08 | 44.00 |
84
  | **TransNormerLLM3-15B** | 15 | 0.60 | 74.34 | 75.68 | 70.44 | 66.22 | 69.36 | 38.40 | 38.40 | 41.05 | 45.30 |
85
  | **TransNormerLLM3-15B** | 15 | 0.65 | 73.15 | 76.55 | 71.60 | 66.46 | 69.65 | 39.68 | 40.80 | 41.20 | 44.90 |
 
 
 
 
 
 
86
 
87
  > **P**: parameter size (billion). **T**: tokens (trillion). **BoolQ**: acc. **PIQA**: acc. **HellaSwag**: acc_norm. **WinoGrande**: acc. **ARC-easy**: acc. **ARC-challenge**: acc_norm. **OpenBookQA**: acc_norm. **MMLU**: 5-shot acc. **C-Eval**: 5-shot acc.
88
 
 
58
  | **15B** | 550B | πŸ€—[step144000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step144000-550Btokens) | πŸ€– | 🐯 |
59
  | **15B** | 600B | πŸ€—[step157000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step157000-600Btokens) | πŸ€– | 🐯 |
60
  | **15B** | 650B | πŸ€—[step170000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step170000-650Btokens) | πŸ€– | 🐯 |
61
+ | **15B** | 700B | πŸ€—[step183000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step183000-700Btokens) | πŸ€– | 🐯 |
62
+ | **15B** | 750B | πŸ€—[step195500](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step195500-750Btokens) | πŸ€– | 🐯 |
63
+ | **15B** | 800B | πŸ€—[step209000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step209000-800Btokens) | πŸ€– | 🐯 |
64
+ | **15B** | 850B | πŸ€—[step222000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step222000-850Btokens) | πŸ€– | 🐯 |
65
+ | **15B** | 900B | πŸ€—[step235000](https://huggingface.co/OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints/tree/step235000-900Btokens) | πŸ€– | 🐯 |
66
+
67
+
68
+
69
  ```python
70
  from transformers import AutoModelForCausalLM, AutoTokenizer
71
 
72
+ tokenizer = AutoTokenizer.from_pretrained("OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints", revision='step235000-900Btokens', trust_remote_code=True)
73
+ model = AutoModelForCausalLM.from_pretrained("OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints", torch_dtype=torch.bfloat16, revision='step235000-900Btokens', device_map="auto", trust_remote_code=True)
74
  ```
75
 
76
  # Benchmark Results
 
85
  | **TransNormerLLM3-15B** | 15 | 0.25 | 66.70 | 76.50 | 66.51 | 64.80 | 66.84 | 36.18 | 39.40 | 30.87 | 36.10 |
86
  | **TransNormerLLM3-15B** | 15 | 0.30 | 67.00 | 76.50 | 67.17 | 64.40 | 66.29 | 36.77 | 38.80 | 33.99 | 37.60 |
87
  | **TransNormerLLM3-15B** | 15 | 0.35 | 65.78 | 75.46 | 67.88 | 66.54 | 67.34 | 38.57 | 39.60 | 36.02 | 39.20 |
88
+ | **TransNormerLLM3-15B** | 15 | 0.40 | 67.34 | 75.24 | 68.51 | 66.22 | 68.94 | 40.10 | 39.20 | 36.91 | 41.10 |
89
  | **TransNormerLLM3-15B** | 15 | 0.45 | 69.02 | 76.28 | 69.11 | 63.77 | 65.82 | 36.01 | 39.40 | 37.17 | 42.80 |
90
  | **TransNormerLLM3-15B** | 15 | 0.50 | 66.15 | 77.09 | 69.75 | 65.11 | 68.56 | 35.84 | 39.60 | 39.81 | 42.00 |
91
  | **TransNormerLLM3-15B** | 15 | 0.55 | 70.24 | 74.05 | 69.96 | 65.75 | 65.61 | 36.69 | 38.60 | 40.08 | 44.00 |
92
  | **TransNormerLLM3-15B** | 15 | 0.60 | 74.34 | 75.68 | 70.44 | 66.22 | 69.36 | 38.40 | 38.40 | 41.05 | 45.30 |
93
  | **TransNormerLLM3-15B** | 15 | 0.65 | 73.15 | 76.55 | 71.60 | 66.46 | 69.65 | 39.68 | 40.80 | 41.20 | 44.90 |
94
+ | **TransNormerLLM3-15B** | 15 | 0.70 | 73.79 | 78.18 | 73.26 | 67.56 | 71.21 | 43.60 | 40.80 | 43.46 | 47.00 |
95
+ | **TransNormerLLM3-15B** | 15 | 0.75 | 76.45 | 78.07 | 74.22 | 69.30 | 71.21 | 43.43 | 42.20 | 43.46 | 47.80 |
96
+ | **TransNormerLLM3-15B** | 15 | 0.80 | 76.97 | 78.84 | 74.95 | 69.85 | 72.14 | 43.52 | 41.20 | 45.21 | 49.41 |
97
+ | **TransNormerLLM3-15B** | 15 | 0.85 | 72.75 | 78.35 | 75.91 | 70.48 | 74.58 | 45.22 | 41.20 | 46.27 | 49.36 |
98
+ | **TransNormerLLM3-15B** | 15 | 0.90 | 76.09 | 77.91 | 76.49 | 70.88 | 72.14 | 42.92 | 40.20 | 45.70 | 50.15 |
99
+
100
 
101
  > **P**: parameter size (billion). **T**: tokens (trillion). **BoolQ**: acc. **PIQA**: acc. **HellaSwag**: acc_norm. **WinoGrande**: acc. **ARC-easy**: acc. **ARC-challenge**: acc_norm. **OpenBookQA**: acc_norm. **MMLU**: 5-shot acc. **C-Eval**: 5-shot acc.
102