openllama-3b-finance

Files changed (9)

README.md +267 -0
config.json +37 -0
model-00001-of-00002.safetensors +3 -0
model-00002-of-00002.safetensors +3 -0
model.safetensors.index.json +244 -0
special_tokens_map.json +24 -0
tokenizer.model +3 -0
tokenizer_config.json +36 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,267 @@

+---
+license: apache-2.0
+base_model: openlm-research/open_llama_3b_v2
+tags:
+- generated_from_trainer
+datasets:
+- financial_phrasebank
+metrics:
+- accuracy
+model-index:
+- name: openllama-3b-finance
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: financial_phrasebank
+      type: financial_phrasebank
+      config: sentences_50agree
+      split: train
+      args: sentences_50agree
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.4142561983471074
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# openllama-3b-finance
+This model is a fine-tuned version of [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) on the `sentences_50agree` configuration of the financial_phrasebank dataset.
+It achieves the following results on the evaluation set at the final logged step (3860):
+- Loss: 3.9007
+- Accuracy: 0.4143
+## Model description
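+A LLaMA-architecture sequence classifier (`LlamaForSequenceClassification`, 26 layers, hidden size 3200) with three output labels: Negative, Neutral, and Positive.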
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 16.4346       | 0.01  | 20   | 2.5105          | 0.4143   |
+| 1.8681        | 0.01  | 40   | 5.7312          | 0.4143   |
+| 1.8542        | 0.02  | 60   | 5.0027          | 0.4143   |
+| 2.3731        | 0.02  | 80   | 4.2958          | 0.4143   |
+| 2.3024        | 0.03  | 100  | 4.9771          | 0.4143   |
+| 2.6812        | 0.03  | 120  | 4.6762          | 0.4143   |
+| 2.4304        | 0.04  | 140  | 5.2389          | 0.4143   |
+| 2.561         | 0.04  | 160  | 4.4461          | 0.4143   |
+| 2.08          | 0.05  | 180  | 4.6807          | 0.4143   |
+| 4.0186        | 0.05  | 200  | 5.3431          | 0.4143   |
+| 2.7261        | 0.06  | 220  | 4.9663          | 0.4143   |
+| 1.7432        | 0.06  | 240  | 4.4788          | 0.4143   |
+| 2.2759        | 0.07  | 260  | 5.6531          | 0.4143   |
+| 1.8702        | 0.07  | 280  | 6.7118          | 0.4143   |
+| 2.2412        | 0.08  | 300  | 5.0398          | 0.4143   |
+| 1.1515        | 0.08  | 320  | 6.3377          | 0.4143   |
+| 2.6582        | 0.09  | 340  | 5.0585          | 0.4143   |
+| 2.1056        | 0.09  | 360  | 5.6544          | 0.4143   |
+| 3.1513        | 0.1   | 380  | 3.8076          | 0.4143   |
+| 2.0003        | 0.1   | 400  | 5.1281          | 0.4132   |
+| 19.9181       | 0.11  | 420  | 35.4379         | 0.4143   |
+| 29.2872       | 0.11  | 440  | 16.2178         | 0.4143   |
+| 3.4213        | 0.12  | 460  | 13.0984         | 0.4143   |
+| 1.3358        | 0.12  | 480  | 27.2436         | 0.4143   |
+| 4.2725        | 0.13  | 500  | 24.0192         | 0.4143   |
+| 4.9844        | 0.13  | 520  | 12.9378         | 0.1178   |
+| 7.9312        | 0.14  | 540  | 10.8854         | 0.4143   |
+| 1.5126        | 0.14  | 560  | 14.3267         | 0.4143   |
+| 3.9021        | 0.15  | 580  | 10.0051         | 0.4143   |
+| 3.7081        | 0.15  | 600  | 9.5176          | 0.1136   |
+| 3.9107        | 0.16  | 620  | 7.2548          | 0.4143   |
+| 2.8381        | 0.17  | 640  | 3.9992          | 0.4143   |
+| 3.0625        | 0.17  | 660  | 4.3300          | 0.4143   |
+| 1.812         | 0.18  | 680  | 10.6038         | 0.4143   |
+| 6.9616        | 0.18  | 700  | 11.0092         | 0.4143   |
+| 1.7157        | 0.19  | 720  | 14.8428         | 0.4143   |
+| 4.7153        | 0.19  | 740  | 3.6624          | 0.4143   |
+| 2.8871        | 0.2   | 760  | 5.7465          | 0.4143   |
+| 2.4885        | 0.2   | 780  | 12.4440         | 0.4143   |
+| 3.137         | 0.21  | 800  | 14.1504         | 0.4143   |
+| 3.0503        | 0.21  | 820  | 14.1326         | 0.4143   |
+| 2.9254        | 0.22  | 840  | 16.0438         | 0.1291   |
+| 2.711         | 0.22  | 860  | 14.0977         | 0.4143   |
+| 4.8591        | 0.23  | 880  | 9.3210          | 0.1281   |
+| 2.8734        | 0.23  | 900  | 6.3782          | 0.4143   |
+| 2.603         | 0.24  | 920  | 5.1658          | 0.4143   |
+| 4.4641        | 0.24  | 940  | 3.9345          | 0.4143   |
+| 2.3522        | 0.25  | 960  | 5.5901          | 0.1436   |
+| 1.9584        | 0.25  | 980  | 5.0562          | 0.4143   |
+| 2.679         | 0.26  | 1000 | 2.5428          | 0.4143   |
+| 4.13          | 0.26  | 1020 | 1.3911          | 0.4143   |
+| 3.4319        | 0.27  | 1040 | 8.2340          | 0.4143   |
+| 1.9382        | 0.27  | 1060 | 8.4589          | 0.4143   |
+| 2.2712        | 0.28  | 1080 | 6.0251          | 0.4143   |
+| 1.8834        | 0.28  | 1100 | 2.4455          | 0.1436   |
+| 0.9941        | 0.29  | 1120 | 8.7371          | 0.4143   |
+| 3.3895        | 0.29  | 1140 | 6.2867          | 0.1426   |
+| 2.2968        | 0.3   | 1160 | 10.3440         | 0.4143   |
+| 4.9047        | 0.3   | 1180 | 8.0926          | 0.0816   |
+| 4.6894        | 0.31  | 1200 | 3.7347          | 0.3698   |
+| 2.9471        | 0.32  | 1220 | 4.9616          | 0.4143   |
+| 2.9446        | 0.32  | 1240 | 5.8887          | 0.4143   |
+| 1.6756        | 0.33  | 1260 | 7.0233          | 0.4143   |
+| 2.0442        | 0.33  | 1280 | 7.5129          | 0.1322   |
+| 3.7822        | 0.34  | 1300 | 3.1115          | 0.4143   |
+| 2.0277        | 0.34  | 1320 | 5.9831          | 0.4143   |
+| 2.624         | 0.35  | 1340 | 3.2104          | 0.4143   |
+| 2.1893        | 0.35  | 1360 | 4.3662          | 0.1364   |
+| 3.0973        | 0.36  | 1380 | 3.2219          | 0.4143   |
+| 1.9835        | 0.36  | 1400 | 5.1431          | 0.4143   |
+| 2.9711        | 0.37  | 1420 | 6.0129          | 0.4143   |
+| 3.0045        | 0.37  | 1440 | 3.2609          | 0.4143   |
+| 1.0503        | 0.38  | 1460 | 7.6840          | 0.4143   |
+| 2.5946        | 0.38  | 1480 | 5.1945          | 0.4143   |
+| 2.9221        | 0.39  | 1500 | 3.5226          | 0.4143   |
+| 1.5624        | 0.39  | 1520 | 5.3887          | 0.4143   |
+| 2.0339        | 0.4   | 1540 | 4.2434          | 0.4143   |
+| 2.4852        | 0.4   | 1560 | 4.1994          | 0.4143   |
+| 1.7668        | 0.41  | 1580 | 5.5635          | 0.4143   |
+| 2.282         | 0.41  | 1600 | 5.1922          | 0.4143   |
+| 3.2027        | 0.42  | 1620 | 3.9420          | 0.4143   |
+| 2.5766        | 0.42  | 1640 | 4.9683          | 0.4143   |
+| 2.268         | 0.43  | 1660 | 6.2959          | 0.4143   |
+| 3.2091        | 0.43  | 1680 | 4.8009          | 0.4143   |
+| 1.9654        | 0.44  | 1700 | 5.8059          | 0.4143   |
+| 2.17          | 0.44  | 1720 | 5.4482          | 0.4143   |
+| 2.2219        | 0.45  | 1740 | 4.4156          | 0.4143   |
+| 1.9873        | 0.45  | 1760 | 5.1548          | 0.4143   |
+| 2.51          | 0.46  | 1780 | 3.1345          | 0.4143   |
+| 2.8949        | 0.46  | 1800 | 5.3419          | 0.4143   |
+| 1.2941        | 0.47  | 1820 | 6.8446          | 0.4143   |
+| 2.3475        | 0.48  | 1840 | 5.9935          | 0.4143   |
+| 2.7907        | 0.48  | 1860 | 5.8123          | 0.4143   |
+| 2.0038        | 0.49  | 1880 | 6.3927          | 0.4143   |
+| 2.0324        | 0.49  | 1900 | 6.4023          | 0.4143   |
+| 2.3211        | 0.5   | 1920 | 5.9480          | 0.4143   |
+| 2.3883        | 0.5   | 1940 | 5.5011          | 0.4143   |
+| 2.7683        | 0.51  | 1960 | 3.7333          | 0.4143   |
+| 1.6062        | 0.51  | 1980 | 7.2244          | 0.1508   |
+| 2.3866        | 0.52  | 2000 | 4.8682          | 0.4143   |
+| 2.3527        | 0.52  | 2020 | 3.9189          | 0.4143   |
+| 3.0126        | 0.53  | 2040 | 4.3666          | 0.4143   |
+| 1.9683        | 0.53  | 2060 | 5.1474          | 0.4143   |
+| 2.5018        | 0.54  | 2080 | 4.5417          | 0.4143   |
+| 1.555         | 0.54  | 2100 | 5.0804          | 0.4143   |
+| 1.6115        | 0.55  | 2120 | 5.1319          | 0.4143   |
+| 2.2321        | 0.55  | 2140 | 5.3196          | 0.4143   |
+| 2.3614        | 0.56  | 2160 | 4.0629          | 0.4143   |
+| 1.6915        | 0.56  | 2180 | 5.8209          | 0.4143   |
+| 2.4031        | 0.57  | 2200 | 4.3059          | 0.4143   |
+| 1.5659        | 0.57  | 2220 | 5.1369          | 0.4143   |
+| 1.2592        | 0.58  | 2240 | 5.4046          | 0.4143   |
+| 1.5577        | 0.58  | 2260 | 5.8448          | 0.4143   |
+| 1.7656        | 0.59  | 2280 | 5.6683          | 0.4143   |
+| 1.5057        | 0.59  | 2300 | 5.7769          | 0.4143   |
+| 2.3733        | 0.6   | 2320 | 5.0004          | 0.4143   |
+| 2.118         | 0.6   | 2340 | 5.2127          | 0.4143   |
+| 2.2942        | 0.61  | 2360 | 4.8589          | 0.4143   |
+| 2.0524        | 0.61  | 2380 | 3.9148          | 0.4143   |
+| 1.8707        | 0.62  | 2400 | 3.2284          | 0.4143   |
+| 1.6804        | 0.62  | 2420 | 4.9466          | 0.4143   |
+| 2.5137        | 0.63  | 2440 | 4.5307          | 0.4143   |
+| 1.1823        | 0.64  | 2460 | 4.7444          | 0.4143   |
+| 2.9106        | 0.64  | 2480 | 3.7200          | 0.4143   |
+| 1.3376        | 0.65  | 2500 | 4.6969          | 0.4143   |
+| 1.8187        | 0.65  | 2520 | 4.2458          | 0.4143   |
+| 1.8444        | 0.66  | 2540 | 4.6003          | 0.4143   |
+| 2.1427        | 0.66  | 2560 | 4.7394          | 0.4143   |
+| 2.2483        | 0.67  | 2580 | 4.6959          | 0.4143   |
+| 1.5997        | 0.67  | 2600 | 5.5665          | 0.4143   |
+| 2.0095        | 0.68  | 2620 | 4.5815          | 0.4143   |
+| 1.4664        | 0.68  | 2640 | 3.4096          | 0.4143   |
+| 1.4128        | 0.69  | 2660 | 4.2751          | 0.4143   |
+| 2.4907        | 0.69  | 2680 | 3.0278          | 0.4143   |
+| 1.0484        | 0.7   | 2700 | 3.7867          | 0.4143   |
+| 2.7561        | 0.7   | 2720 | 4.0402          | 0.4143   |
+| 1.2491        | 0.71  | 2740 | 3.3789          | 0.4143   |
+| 1.1299        | 0.71  | 2760 | 2.4017          | 0.4143   |
+| 1.9811        | 0.72  | 2780 | 3.3625          | 0.4143   |
+| 2.1781        | 0.72  | 2800 | 3.2631          | 0.4143   |
+| 1.6062        | 0.73  | 2820 | 2.9967          | 0.4143   |
+| 0.928         | 0.73  | 2840 | 5.6052          | 0.4143   |
+| 2.5659        | 0.74  | 2860 | 4.8605          | 0.4143   |
+| 1.4248        | 0.74  | 2880 | 4.8685          | 0.4143   |
+| 2.3335        | 0.75  | 2900 | 4.5013          | 0.4143   |
+| 1.8546        | 0.75  | 2920 | 3.7017          | 0.4143   |
+| 1.5698        | 0.76  | 2940 | 3.8911          | 0.4143   |
+| 1.8653        | 0.76  | 2960 | 4.2637          | 0.4143   |
+| 1.4354        | 0.77  | 2980 | 5.1895          | 0.4143   |
+| 2.0558        | 0.77  | 3000 | 4.4362          | 0.4143   |
+| 2.0876        | 0.78  | 3020 | 4.6924          | 0.4143   |
+| 2.4282        | 0.78  | 3040 | 4.6526          | 0.4143   |
+| 1.4837        | 0.79  | 3060 | 5.2878          | 0.4143   |
+| 2.2982        | 0.8   | 3080 | 5.0637          | 0.4143   |
+| 2.2615        | 0.8   | 3100 | 4.6995          | 0.4143   |
+| 1.7026        | 0.81  | 3120 | 4.4688          | 0.4143   |
+| 1.6352        | 0.81  | 3140 | 4.8815          | 0.4143   |
+| 2.782         | 0.82  | 3160 | 3.6835          | 0.4143   |
+| 0.3105        | 0.82  | 3180 | 3.8391          | 0.4143   |
+| 2.3949        | 0.83  | 3200 | 4.9408          | 0.4143   |
+| 3.0385        | 0.83  | 3220 | 4.3234          | 0.4143   |
+| 2.146         | 0.84  | 3240 | 3.7336          | 0.4143   |
+| 1.9198        | 0.84  | 3260 | 4.2217          | 0.4143   |
+| 0.7858        | 0.85  | 3280 | 4.4744          | 0.4143   |
+| 0.7785        | 0.85  | 3300 | 5.0257          | 0.4143   |
+| 2.7858        | 0.86  | 3320 | 4.8552          | 0.4143   |
+| 2.0922        | 0.86  | 3340 | 4.2950          | 0.4143   |
+| 1.9892        | 0.87  | 3360 | 3.9094          | 0.4143   |
+| 2.2241        | 0.87  | 3380 | 3.7403          | 0.4143   |
+| 2.7226        | 0.88  | 3400 | 3.6119          | 0.4143   |
+| 1.5888        | 0.88  | 3420 | 3.8878          | 0.4143   |
+| 2.7581        | 0.89  | 3440 | 4.0297          | 0.4143   |
+| 1.5373        | 0.89  | 3460 | 4.0980          | 0.4143   |
+| 1.5419        | 0.9   | 3480 | 4.0983          | 0.4143   |
+| 1.7618        | 0.9   | 3500 | 4.2322          | 0.4143   |
+| 1.8487        | 0.91  | 3520 | 4.3258          | 0.4143   |
+| 1.0667        | 0.91  | 3540 | 4.1975          | 0.4143   |
+| 2.0457        | 0.92  | 3560 | 4.2679          | 0.4143   |
+| 1.8133        | 0.92  | 3580 | 4.1908          | 0.4143   |
+| 1.5844        | 0.93  | 3600 | 4.1348          | 0.4143   |
+| 1.7202        | 0.93  | 3620 | 4.1382          | 0.4143   |
+| 1.7118        | 0.94  | 3640 | 4.1135          | 0.4143   |
+| 1.208         | 0.95  | 3660 | 4.1240          | 0.4143   |
+| 1.6942        | 0.95  | 3680 | 4.1595          | 0.4143   |
+| 0.9358        | 0.96  | 3700 | 4.2914          | 0.4143   |
+| 0.9632        | 0.96  | 3720 | 4.3381          | 0.4143   |
+| 1.4406        | 0.97  | 3740 | 4.2782          | 0.4143   |
+| 1.5333        | 0.97  | 3760 | 4.1569          | 0.4143   |
+| 2.8499        | 0.98  | 3780 | 3.9997          | 0.4143   |
+| 1.3767        | 0.98  | 3800 | 3.9549          | 0.4143   |
+| 1.0074        | 0.99  | 3820 | 3.9189          | 0.4143   |
+| 1.7482        | 0.99  | 3840 | 3.8958          | 0.4143   |
+| 1.8591        | 1.0   | 3860 | 3.9007          | 0.4143   |
+### Framework versions
+- Transformers 4.32.0
+- Pytorch 2.0.1+cu117
+- Datasets 2.14.4
+- Tokenizers 0.13.3
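
The accuracy in the README metadata, 0.4142561983471074, is stored at full precision, so it can be turned back into a ratio of correct predictions. The sketch below is an illustration rather than part of the commit: it assumes the evaluation set has fewer than 5000 examples (the bound `MAX_EVAL_SIZE` is a hypothetical choice) and uses Python's `fractions` module to find the closest simple ratio.

```python
from fractions import Fraction

# Accuracy value copied from the model-index metadata in README.md.
REPORTED_ACCURACY = 0.4142561983471074

# Hypothetical upper bound on the evaluation-set size (an assumption).
MAX_EVAL_SIZE = 5000

# Closest fraction to the reported float with a denominator under the bound.
ratio = Fraction(REPORTED_ACCURACY).limit_denominator(MAX_EVAL_SIZE)
correct, total = ratio.numerator, ratio.denominator

print(f"{correct} correct out of {total}")
```

Under that assumption the ratio comes out as 401/968. The denominator is the smallest evaluation-set size consistent with the value; the true size could be a multiple of it. One fewer correct prediction, 400/968, rounds to the 0.4132 logged at step 400.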

config.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "_name_or_path": "openlm-research/open_llama_3b_v2",
+  "architectures": [
+    "LlamaForSequenceClassification"
+  ],
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "hidden_act": "silu",
+  "hidden_size": 3200,
+  "id2label": {
+    "0": "Negative",
+    "1": "Neutral",
+    "2": "Positive"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 8640,
+  "label2id": {
+    "Negative": 0,
+    "Neutral": 1,
+    "Positive": 2
+  },
+  "max_position_embeddings": 2048,
+  "model_type": "llama",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 26,
+  "num_key_value_heads": 32,
+  "pad_token_id": 2,
+  "pretraining_tp": 1,
+  "problem_type": "single_label_classification",
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float32",
+  "transformers_version": "4.32.0",
+  "use_cache": false,
+  "vocab_size": 32000
+}
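
The shape fields in config.json are enough to estimate the checkpoint size. The sketch below counts parameters for a LLaMA-style stack and multiplies by 4 bytes for float32. The per-layer breakdown follows the tensor names listed in model.safetensors.index.json; the final norm and the bias-free classification head are assumptions, since those entries are not visible in this excerpt.

```python
# Values copied from config.json.
hidden_size = 3200
intermediate_size = 8640
num_hidden_layers = 26
vocab_size = 32000
num_labels = 3        # Negative, Neutral, Positive
bytes_per_param = 4   # torch_dtype is float32

embed = vocab_size * hidden_size
attention = 4 * hidden_size * hidden_size    # q_proj, k_proj, v_proj, o_proj
mlp = 3 * hidden_size * intermediate_size    # gate_proj, up_proj, down_proj
layer_norms = 2 * hidden_size                # input and post-attention norms
per_layer = attention + mlp + layer_norms

# Assumed: one final norm plus a bias-free linear head of shape (num_labels, hidden_size).
final_norm = hidden_size
classifier_head = num_labels * hidden_size

total_params = embed + num_hidden_layers * per_layer + final_norm + classifier_head
total_bytes = total_params * bytes_per_param

print(total_params, total_bytes)
```

This gives 3,324,083,200 parameters and 13,296,332,800 bytes, which equals the `total_size` recorded in model.safetensors.index.json.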

model-00001-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:947340c7731ba673a0a8182491b8261494e0aa2f2db7b3a8e0c679918d8da150
+size 9990650632

model-00002-of-00002.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:749656db21a5d0a21b31d026cc62d17fb012d310470e677302525ea9e686bd99
+size 3305709376
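
The two .safetensors entries are Git LFS pointer files, not the weights themselves: each records a spec version, a SHA-256 object ID, and the byte size of the real file. A minimal sketch of reading them, with `parse_lfs_pointer` as a hypothetical helper name, might look like this.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields

# Pointer contents copied from the two shard entries in this commit.
shard_1 = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:947340c7731ba673a0a8182491b8261494e0aa2f2db7b3a8e0c679918d8da150
size 9990650632
""")
shard_2 = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:749656db21a5d0a21b31d026cc62d17fb012d310470e677302525ea9e686bd99
size 3305709376
""")

on_disk = shard_1["size"] + shard_2["size"]
tensor_bytes = 13296332800  # total_size from model.safetensors.index.json
print(on_disk, on_disk - tensor_bytes)
```

The shards add up to 13,296,360,008 bytes, 27,208 more than `total_size`. That gap is presumably the per-shard safetensors headers, though the commit itself does not say so.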

model.safetensors.index.json ADDED

	@@ -0,0 +1,244 @@

+{
+  "metadata": {
+    "total_size": 13296332800
+  },
+  "weight_map": {
+    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.19.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.norm.weight": "model-00002-of-00002.safetensors",
+    "score.weight": "model-00002-of-00002.safetensors"
+  }
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "</s>",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:91b289e85fa20fd375d8b33dc12f77616f18abc6359804471d1fafcb425fecb8
+size 511574
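Note on the pointer above: the three lines are a Git LFS pointer, not the tokenizer file itself. They record the spec version, the SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of reading those fields follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
# Pointer text copied from the diff above, without the leading "+" markers.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:91b289e85fa20fd375d8b33dc12f77616f18abc6359804471d1fafcb425fecb8
size 511574
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_TEXT))
```

The same helper applies to the other pointer files in this commit, since they use the same three-line layout.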

tokenizer_config.json ADDED

	@@ -0,0 +1,36 @@

+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "legacy": null,
+  "model_max_length": 2048,
+  "pad_token": null,
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "use_default_system_prompt": true
+}
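Note on the two tokenizer files: `special_tokens_map.json` records `pad_token` as `"</s>"`, while `tokenizer_config.json` records `pad_token` as `null`. The diff does not say which value takes effect when the tokenizer is loaded, so the sketch below only reports where the two files disagree. The inline excerpts keep just the token fields needed for the comparison, and `token_content` and `mismatched_tokens` are hypothetical helper names.

```python
import json

# Trimmed excerpts of the two files from the diff above; only the token
# entries are kept, and flags such as "lstrip" are omitted for brevity.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "bos_token": {"content": "<s>"},
  "eos_token": {"content": "</s>"},
  "pad_token": "</s>",
  "unk_token": {"content": "<unk>"}
}
""")

TOKENIZER_CONFIG = json.loads("""
{
  "bos_token": {"__type": "AddedToken", "content": "<s>"},
  "eos_token": {"__type": "AddedToken", "content": "</s>"},
  "pad_token": null,
  "unk_token": {"__type": "AddedToken", "content": "<unk>"}
}
""")

TOKEN_KEYS = ("bos_token", "eos_token", "pad_token", "unk_token")


def token_content(entry):
    """Return the token string whether the entry is a dict, a string, or None."""
    if isinstance(entry, dict):
        return entry.get("content")
    return entry


def mismatched_tokens(first, second, keys=TOKEN_KEYS):
    """Return {key: (value in first, value in second)} for keys that differ."""
    result = {}
    for key in keys:
        a = token_content(first.get(key))
        b = token_content(second.get(key))
        if a != b:
            result[key] = (a, b)
    return result


if __name__ == "__main__":
    print(mismatched_tokens(SPECIAL_TOKENS_MAP, TOKENIZER_CONFIG))
```

Anyone batching inputs with padding may want to check the loaded tokenizer's pad token explicitly rather than rely on either file alone.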

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a0880f6dacc8a59ad09e97992a84926677c8ab52e411680dfa60fea9d2f76058
+size 4155