
gemma7bit-lora-sql

This model is a LoRA adapter for google/gemma-7b, fine-tuned with PEFT on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 40.8546
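
Because this repository holds a PEFT (LoRA) adapter rather than full model weights, it would typically be loaded on top of the base model. Below is a minimal loading sketch, assuming the repo id Liu-Xiang/gemma7bit-lora-sql shown in the model tree and the framework versions listed further down; the prompt format is only a guess, since the card does not document the training data.

```python
# Minimal sketch: load the LoRA adapter on top of google/gemma-7b.
# Note: gemma-7b is a gated model, so Hub access must already be granted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-7b"
adapter_id = "Liu-Xiang/gemma7bit-lora-sql"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Hypothetical prompt; the actual training prompt format is not documented.
prompt = "Question: List all customers from Berlin.\nSQL:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```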

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 1399
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 2
  • training_steps: 500
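
The hyperparameters above can be expressed as transformers.TrainingArguments, as in the sketch below. The LoRA settings (rank, alpha, target modules) and the dataset are not reported in this card, so the LoraConfig values shown are placeholders rather than the values actually used.

```python
# Reproduction sketch of the listed hyperparameters (not the original training script).
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="gemma7bit-lora-sql",
    learning_rate=3e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=1399,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=2,
    max_steps=500,
)

peft_config = LoraConfig(
    # Placeholder values -- the actual LoRA configuration is not reported in the card.
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```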

Training results

Training Loss Epoch Step Validation Loss
47.4397 0.0000 2 112.0961
54.9563 0.0001 4 113.0320
43.0701 0.0001 6 105.0883
29.3374 0.0001 8 93.7564
24.013 0.0001 10 70.5026
5.7244 0.0002 12 70.3644
6.7112 0.0002 14 69.0918
5.139 0.0002 16 67.7594
5.658 0.0002 18 64.8925
3.348 0.0003 20 62.9086
3.0009 0.0003 22 54.9081
3.1078 0.0003 24 47.0123
2.9829 0.0003 26 44.8515
2.4287 0.0004 28 42.1563
2.1561 0.0004 30 39.5831
2.3805 0.0004 32 37.8210
4.199 0.0004 34 36.5321
4.2891 0.0005 36 35.5581
2.8376 0.0005 38 35.1185
2.4216 0.0005 40 35.1674
2.2408 0.0005 42 34.9562
3.4941 0.0006 44 35.2440
3.4866 0.0006 46 34.5079
2.2815 0.0006 48 34.1046
2.2584 0.0006 50 34.0249
2.7932 0.0007 52 34.8069
2.8995 0.0007 54 35.0606
3.3107 0.0007 56 35.8230
3.0793 0.0007 58 36.0362
4.5829 0.0008 60 34.8489
2.6841 0.0008 62 33.6494
3.5738 0.0008 64 32.4676
2.955 0.0008 66 31.9876
2.1847 0.0009 68 31.4324
3.5749 0.0009 70 31.4434
2.0652 0.0009 72 31.6449
1.9506 0.0009 74 31.8311
2.6852 0.0010 76 32.0123
1.8463 0.0010 78 32.2012
2.4999 0.0010 80 32.4074
1.7525 0.0010 82 32.5013
1.865 0.0011 84 32.7458
2.5512 0.0011 86 32.9542
2.041 0.0011 88 33.7792
3.4588 0.0011 90 33.5860
2.2258 0.0012 92 33.9242
2.1416 0.0012 94 34.2110
1.9904 0.0012 96 34.1852
1.9793 0.0012 98 34.1257
3.3329 0.0013 100 34.2512
2.6011 0.0013 102 34.4635
2.4212 0.0013 104 34.5869
1.941 0.0014 106 34.7022
2.4623 0.0014 108 34.9359
2.4267 0.0014 110 35.1085
1.7913 0.0014 112 35.1962
1.6845 0.0015 114 35.5859
3.0888 0.0015 116 35.8237
3.4959 0.0015 118 35.4403
2.5661 0.0015 120 35.3171
2.4044 0.0016 122 35.1409
3.1554 0.0016 124 35.0385
2.0637 0.0016 126 35.4118
5.6131 0.0016 128 35.2343
3.0214 0.0017 130 35.9148
1.771 0.0017 132 36.5919
2.4126 0.0017 134 36.8129
2.5102 0.0017 136 36.6166
6.5612 0.0018 138 36.9545
2.1154 0.0018 140 36.8204
2.533 0.0018 142 36.5374
1.7012 0.0018 144 36.6904
2.2287 0.0019 146 36.1521
4.2646 0.0019 148 36.1889
1.8624 0.0019 150 36.5876
1.9946 0.0019 152 36.6302
2.124 0.0020 154 36.6274
3.01 0.0020 156 36.6652
1.928 0.0020 158 37.0886
2.6035 0.0020 160 37.2648
2.2572 0.0021 162 37.4929
1.5284 0.0021 164 37.7779
1.1103 0.0021 166 37.9401
2.4597 0.0021 168 37.7270
2.4846 0.0022 170 37.4224
2.6234 0.0022 172 36.6518
2.4765 0.0022 174 36.2149
2.0448 0.0022 176 35.9293
2.2736 0.0023 178 35.5881
2.7181 0.0023 180 35.3821
1.9195 0.0023 182 35.2214
2.9274 0.0023 184 35.0837
3.191 0.0024 186 35.1131
2.6804 0.0024 188 35.1649
1.5547 0.0024 190 35.3133
2.2601 0.0024 192 35.6737
2.5229 0.0025 194 36.1338
2.6806 0.0025 196 36.2942
2.2258 0.0025 198 36.4748
1.2856 0.0025 200 36.9566
2.1439 0.0026 202 37.1834
4.0704 0.0026 204 37.5976
2.5138 0.0026 206 38.2877
2.9025 0.0027 208 38.5739
1.8761 0.0027 210 38.3348
1.9228 0.0027 212 38.3183
1.7924 0.0027 214 38.2928
2.7619 0.0028 216 38.1185
2.1031 0.0028 218 37.7249
2.6893 0.0028 220 37.7826
2.255 0.0028 222 37.7949
2.754 0.0029 224 37.8576
1.6294 0.0029 226 38.2263
1.8586 0.0029 228 38.4837
2.4252 0.0029 230 38.7646
2.36 0.0030 232 38.9834
1.4407 0.0030 234 39.1561
1.6109 0.0030 236 39.3041
2.2582 0.0030 238 39.3389
2.8185 0.0031 240 39.5245
1.6233 0.0031 242 39.3154
2.4039 0.0031 244 39.0988
1.7734 0.0031 246 39.0567
1.4779 0.0032 248 39.0881
2.7848 0.0032 250 38.9895
2.2963 0.0032 252 39.2507
2.0605 0.0032 254 39.3339
3.3667 0.0033 256 39.5060
2.9702 0.0033 258 39.5491
2.6734 0.0033 260 39.7907
2.4727 0.0033 262 40.1472
2.7539 0.0034 264 40.4749
1.601 0.0034 266 40.3649
2.1531 0.0034 268 40.2932
1.8656 0.0034 270 40.2728
1.9617 0.0035 272 40.3498
1.8911 0.0035 274 40.3157
2.3878 0.0035 276 40.2882
2.677 0.0035 278 40.4437
2.8035 0.0036 280 40.2423
1.7537 0.0036 282 40.0182
1.5873 0.0036 284 39.8449
1.7802 0.0036 286 39.7251
2.1861 0.0037 288 39.3972
1.9197 0.0037 290 39.4064
2.6752 0.0037 292 39.4320
1.7225 0.0037 294 39.4498
1.7274 0.0038 296 39.4309
3.9891 0.0038 298 40.1752
2.5153 0.0038 300 40.9025
2.0587 0.0038 302 41.4380
2.3115 0.0039 304 41.9152
1.8684 0.0039 306 42.4118
2.0388 0.0039 308 42.8904
2.9396 0.0040 310 43.0102
1.5832 0.0040 312 43.0678
1.897 0.0040 314 43.0292
2.2008 0.0040 316 43.0302
2.4185 0.0041 318 42.8252
1.9265 0.0041 320 42.5088
2.5759 0.0041 322 42.2636
2.9898 0.0041 324 42.1571
1.7106 0.0042 326 41.7366
2.3907 0.0042 328 41.3667
2.4861 0.0042 330 41.3056
1.6998 0.0042 332 41.2167
2.6034 0.0043 334 41.2615
1.6455 0.0043 336 41.2327
1.8484 0.0043 338 41.2317
2.2123 0.0043 340 41.2374
1.8939 0.0044 342 41.1753
1.881 0.0044 344 41.1000
1.5313 0.0044 346 40.9959
2.3099 0.0044 348 40.9817
2.2593 0.0045 350 40.9572
2.2597 0.0045 352 40.9278
2.1038 0.0045 354 40.8672
1.6107 0.0045 356 40.6815
2.0831 0.0046 358 40.5641
2.2921 0.0046 360 40.5117
2.3178 0.0046 362 40.5802
1.6295 0.0046 364 40.4780
2.038 0.0047 366 40.5544
1.7012 0.0047 368 40.7328
2.5292 0.0047 370 40.8337
1.8677 0.0047 372 40.9356
1.5897 0.0048 374 41.0250
1.5096 0.0048 376 41.0558
1.6413 0.0048 378 41.2060
1.6334 0.0048 380 41.2175
2.0367 0.0049 382 41.3215
1.9155 0.0049 384 41.4322
1.9553 0.0049 386 41.4096
2.3982 0.0049 388 41.3870
2.1094 0.0050 390 41.2572
1.9943 0.0050 392 41.1927
2.1017 0.0050 394 41.1805
1.8297 0.0050 396 41.0817
2.2271 0.0051 398 41.0460
2.022 0.0051 400 41.0754
1.8099 0.0051 402 41.0777
2.0973 0.0051 404 41.1348
2.03 0.0052 406 41.1109
1.7342 0.0052 408 41.1719
2.0422 0.0052 410 41.1616
2.6192 0.0052 412 41.0411
1.7107 0.0053 414 41.0704
2.8018 0.0053 416 41.0641
1.3767 0.0053 418 41.0719
1.9952 0.0054 420 41.0151
1.7584 0.0054 422 40.9978
2.1318 0.0054 424 40.9933
2.3412 0.0054 426 40.9837
1.6604 0.0055 428 41.0310
1.6301 0.0055 430 40.9782
2.0232 0.0055 432 40.9377
1.7096 0.0055 434 40.9645
2.1696 0.0056 436 40.9631
1.5297 0.0056 438 40.9690
1.4017 0.0056 440 41.0132
1.7817 0.0056 442 40.9486
1.7264 0.0057 444 40.9499
1.8601 0.0057 446 41.0064
1.9614 0.0057 448 41.0266
2.3045 0.0057 450 41.0035
2.67 0.0058 452 41.0159
1.5752 0.0058 454 40.9748
1.7464 0.0058 456 40.9395
1.9167 0.0058 458 40.9119
1.8777 0.0059 460 40.9021
1.5879 0.0059 462 40.9164
1.942 0.0059 464 40.8847
1.6303 0.0059 466 40.9104
2.1252 0.0060 468 40.9000
2.2879 0.0060 470 40.9209
1.7646 0.0060 472 40.8601
2.3169 0.0060 474 40.8726
1.7797 0.0061 476 40.8563
2.0428 0.0061 478 40.8609
2.4124 0.0061 480 40.8663
2.2955 0.0061 482 40.8601
1.3035 0.0062 484 40.8517
2.611 0.0062 486 40.8781
2.0677 0.0062 488 40.8694
2.1645 0.0062 490 40.8864
2.0708 0.0063 492 40.8633
1.663 0.0063 494 40.8689
1.9784 0.0063 496 40.8672
1.7215 0.0063 498 40.8439
2.2366 0.0064 500 40.8546

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.2.2+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
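
A small sketch for checking that a local environment matches the versions listed above:

```python
# Compare installed package versions against the ones listed in "Framework versions".
import peft, transformers, torch, datasets, tokenizers

expected = {
    "peft": "0.12.0",
    "transformers": "4.44.0",
    "torch": "2.2.2+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}
found = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"got {found[name]}"
    print(f"{name}: expected {version} -> {status}")
```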
Model tree for Liu-Xiang/gemma7bit-lora-sql

  • Base model: google/gemma-7b
  • Adapter: this model