caomingjun/storyteller

This model is a fine-tuned version of gpt2 on the roneneldan/TinyStories dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2057
  • Accuracy: 0.6681
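
For readers who prefer perplexity, the reported loss can be converted directly. This is a small sketch that assumes the loss is the mean per-token cross-entropy in nats (the usual convention for causal language modeling with the Trainer) and that accuracy is top-1 next-token accuracy; the card states neither explicitly.

```python
import math

# Final evaluation metrics as reported in this card.
eval_loss = 1.2057      # assumed: mean cross-entropy per token, in nats
eval_accuracy = 0.6681  # assumed: fraction of next tokens predicted correctly

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")  # prints "perplexity ~ 3.34"
```

Under those assumptions the evaluation perplexity is about 3.34.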

Model description

More information needed

Intended uses & limitations

More information needed
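
Until this section is filled in, the following is a minimal usage sketch rather than documented guidance. It assumes the checkpoint loads through the standard `text-generation` pipeline in Transformers, as a GPT-2 fine-tune normally would. The helper name `generate_story` and the sampling settings (`top_p`, `max_new_tokens`) are illustrative choices, not values from this card.

```python
MODEL_ID = "caomingjun/storyteller"  # repository name taken from this card


def generate_story(prompt, max_new_tokens=100, seed=42):
    """Sample one continuation of `prompt` from the fine-tuned checkpoint.

    Hypothetical helper: the sampling settings are assumptions, not
    recommendations from the model author.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline, set_seed

    set_seed(seed)  # make sampling reproducible
    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.95,
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_story("Once upon a time, there was a little fox who")` downloads the weights on first use and returns the prompt followed by the sampled continuation. Since the training data is TinyStories, short and simple story openings are the most natural prompts.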

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
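
The list above maps onto `TrainingArguments` roughly as follows. This is a sketch, not the author's training script: it assumes the batch sizes are per-device values and that "Native AMP" corresponds to `fp16=True`. The `output_dir` value and the helper name `build_training_args` are placeholders.

```python
# Hyperparameters from this card, written as TrainingArguments keyword arguments.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,  # assumed per-device
    "per_device_eval_batch_size": 2,   # assumed per-device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2,
    "fp16": True,  # assumption: "Native AMP" means fp16 autocast; requires a GPU
}


def build_training_args(output_dir="storyteller-out"):
    """Build TrainingArguments from the card's values (output_dir is a placeholder)."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Settings the card does not list, such as warmup, weight decay, and gradient accumulation, are left at the library defaults here.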

Training results

Columns (left to right): Training Loss, Epoch, Step, Accuracy, Validation Loss
1.764 0.0087 2000 0.5790 1.6696
1.6751 0.0174 4000 0.5903 1.6043
1.6222 0.0261 6000 0.5974 1.5656
1.6077 0.0348 8000 0.6023 1.5383
1.5761 0.0435 10000 0.6066 1.5165
1.5634 0.0522 12000 0.6100 1.4985
1.5397 0.0609 14000 0.6125 1.4854
1.5492 0.0696 16000 0.6141 1.4753
1.5189 0.0783 18000 0.6164 1.4622
1.5083 0.0870 20000 0.6180 1.4540
1.4956 0.0957 22000 0.6201 1.4435
1.488 0.1044 24000 0.6218 1.4359
1.4922 0.1131 26000 0.6233 1.4283
1.4846 0.1218 28000 0.6241 1.4230
1.4676 0.1305 30000 0.6255 1.4168
1.459 0.1392 32000 0.6263 1.4103
1.4594 0.1479 34000 0.6278 1.4048
1.4594 0.1566 36000 0.6285 1.3996
1.4525 0.1653 38000 0.6297 1.3940
1.4608 0.1740 40000 0.6304 1.3907
1.4432 0.1827 42000 0.6311 1.3856
1.4267 0.1914 44000 0.6319 1.3828
1.4265 0.2001 46000 0.6327 1.3796
1.4204 0.2088 48000 0.6337 1.3734
1.4161 0.2175 50000 0.6340 1.3709
1.413 0.2262 52000 0.6352 1.3666
1.4134 0.2349 54000 0.6358 1.3631
1.4231 0.2436 56000 0.6363 1.3607
1.4091 0.2523 58000 0.6369 1.3574
1.3985 0.2610 60000 0.6373 1.3546
1.3957 0.2697 62000 0.6379 1.3524
1.3939 0.2784 64000 0.6385 1.3487
1.3964 0.2871 66000 0.6390 1.3471
1.3822 0.2958 68000 0.6396 1.3437
1.3922 0.3045 70000 0.6399 1.3426
1.3865 0.3132 72000 0.6403 1.3393
1.3853 0.3219 74000 0.6410 1.3369
1.3976 0.3306 76000 0.6415 1.3337
1.3964 0.3393 78000 0.6416 1.3332
1.3846 0.3480 80000 0.6422 1.3310
1.3825 0.3567 82000 0.6425 1.3288
1.3758 0.3654 84000 0.6431 1.3259
1.3685 0.3741 86000 0.6432 1.3243
1.3812 0.3828 88000 0.6440 1.3215
1.3763 0.3915 90000 0.6441 1.3209
1.3637 0.4002 92000 0.6446 1.3188
1.371 0.4089 94000 0.6452 1.3164
1.3548 0.4176 96000 0.6453 1.3154
1.3533 0.4263 98000 0.6455 1.3141
1.3479 0.4350 100000 0.6459 1.3110
1.3479 0.4437 102000 0.6461 1.3098
1.3543 0.4524 104000 0.6466 1.3087
1.3491 0.4611 106000 0.6470 1.3064
1.3645 0.4698 108000 0.6473 1.3052
1.3603 0.4785 110000 0.6478 1.3032
1.3528 0.4872 112000 0.6480 1.3018
1.3508 0.4959 114000 0.6481 1.3010
1.3488 0.5046 116000 0.6484 1.2984
1.3435 0.5133 118000 0.6486 1.2983
1.3635 0.5220 120000 0.6487 1.2969
1.3462 0.5307 122000 0.6493 1.2947
1.3508 0.5394 124000 0.6494 1.2935
1.364 0.5481 126000 0.6495 1.2925
1.3409 0.5568 128000 0.6502 1.2907
1.3402 0.5655 130000 0.6503 1.2907
1.339 0.5742 132000 0.6508 1.2880
1.325 0.5829 134000 0.6508 1.2869
1.3432 0.5916 136000 0.6508 1.2865
1.3478 0.6003 138000 0.6513 1.2844
1.3345 0.6090 140000 0.6516 1.2836
1.3194 0.6177 142000 0.6514 1.2822
1.3342 0.6264 144000 0.6522 1.2813
1.3333 0.6351 146000 0.6523 1.2807
1.3367 0.6438 148000 0.6522 1.2801
1.3293 0.6525 150000 0.6525 1.2787
1.3337 0.6612 152000 0.6528 1.2770
1.3355 0.6699 154000 0.6530 1.2765
1.3288 0.6786 156000 0.6532 1.2753
1.3362 0.6873 158000 0.6534 1.2738
1.3142 0.6960 160000 0.6534 1.2733
1.3109 0.7047 162000 0.6539 1.2720
1.3264 0.7134 164000 0.6542 1.2710
1.3143 0.7221 166000 0.6543 1.2698
1.3118 0.7308 168000 0.6544 1.2698
1.3121 0.7395 170000 0.6546 1.2683
1.3368 0.7482 172000 0.6550 1.2670
1.3077 0.7569 174000 0.6550 1.2668
1.3104 0.7656 176000 0.6552 1.2663
1.316 0.7743 178000 0.6554 1.2649
1.3209 0.7830 180000 0.6558 1.2632
1.3153 0.7917 182000 0.6553 1.2649
1.3025 0.8004 184000 0.6560 1.2626
1.3146 0.8091 186000 0.6562 1.2619
1.3291 0.8178 188000 0.6563 1.2608
1.3062 0.8265 190000 0.6564 1.2598
1.3009 0.8352 192000 0.6566 1.2592
1.2943 0.8439 194000 0.6566 1.2588
1.2977 0.8526 196000 0.6567 1.2578
1.3073 0.8613 198000 0.6571 1.2565
1.2835 0.8700 200000 0.6575 1.2560
1.3019 0.8787 202000 0.6574 1.2554
1.3134 0.8874 204000 0.6578 1.2544
1.3103 0.8961 206000 0.6579 1.2534
1.2897 0.9048 208000 0.6579 1.2531
1.3014 0.9135 210000 0.6577 1.2524
1.304 0.9222 212000 0.6583 1.2514
1.3043 0.9309 214000 0.6581 1.2515
1.2887 0.9396 216000 0.6585 1.2497
1.3022 0.9483 218000 0.6585 1.2490
1.2773 0.9570 220000 0.6587 1.2490
1.3003 0.9657 222000 0.6589 1.2479
1.295 0.9744 224000 0.6589 1.2477
1.2978 0.9831 226000 0.6593 1.2466
1.3013 0.9918 228000 0.6593 1.2460
1.2879 1.0005 230000 0.6594 1.2450
1.2959 1.0092 232000 0.6595 1.2455
1.2831 1.0179 234000 0.6600 1.2436
1.2678 1.0266 236000 0.6599 1.2437
1.2723 1.0353 238000 0.6598 1.2435
1.2792 1.0440 240000 0.6599 1.2429
1.2707 1.0527 242000 0.6601 1.2422
1.2788 1.0614 244000 0.6604 1.2414
1.2667 1.0701 246000 0.6604 1.2410
1.2792 1.0788 248000 0.6605 1.2407
1.2748 1.0875 250000 0.6608 1.2399
1.2669 1.0962 252000 0.6611 1.2392
1.2729 1.1049 254000 0.6608 1.2391
1.263 1.1136 256000 0.6610 1.2387
1.2684 1.1223 258000 0.6611 1.2380
1.2638 1.1310 260000 0.6612 1.2374
1.2993 1.1397 262000 0.6615 1.2366
1.2842 1.1484 264000 0.6614 1.2364
1.2669 1.1571 266000 0.6618 1.2350
1.2698 1.1658 268000 0.6617 1.2353
1.264 1.1745 270000 0.6617 1.2347
1.278 1.1832 272000 0.6618 1.2342
1.269 1.1919 274000 0.6619 1.2341
1.271 1.2006 276000 0.6618 1.2345
1.2727 1.2093 278000 0.6618 1.2333
1.2703 1.2180 280000 0.6624 1.2328
1.2691 1.2267 282000 0.6625 1.2316
1.2771 1.2354 284000 0.6628 1.2304
1.2805 1.2441 286000 0.6626 1.2305
1.2646 1.2528 288000 0.6627 1.2304
1.2523 1.2615 290000 0.6628 1.2300
1.2802 1.2702 292000 0.6630 1.2288
1.2734 1.2789 294000 0.6628 1.2295
1.2625 1.2876 296000 0.6631 1.2287
1.2798 1.2963 298000 0.6632 1.2279
1.2524 1.3050 300000 0.6634 1.2274
1.2658 1.3137 302000 0.6634 1.2268
1.2692 1.3224 304000 0.6635 1.2267
1.26 1.3311 306000 0.6637 1.2261
1.2598 1.3398 308000 0.6636 1.2261
1.2689 1.3485 310000 0.6637 1.2258
1.2619 1.3572 312000 0.6638 1.2253
1.2382 1.3659 314000 0.6640 1.2247
1.2665 1.3746 316000 0.6638 1.2249
1.2451 1.3833 318000 0.6642 1.2231
1.2633 1.3920 320000 0.6643 1.2231
1.2521 1.4007 322000 0.6644 1.2224
1.2804 1.4094 324000 0.6644 1.2224
1.2505 1.4181 326000 0.6646 1.2220
1.2626 1.4268 328000 0.6646 1.2213
1.2631 1.4355 330000 0.6647 1.2211
1.2586 1.4442 332000 0.6646 1.2212
1.2642 1.4529 334000 0.6647 1.2210
1.2738 1.4616 336000 0.6648 1.2204
1.2564 1.4703 338000 0.6650 1.2198
1.2683 1.4790 340000 0.6649 1.2197
1.2591 1.4877 342000 0.6650 1.2194
1.2593 1.4964 344000 0.6651 1.2191
1.2528 1.5051 346000 0.6650 1.2190
1.2658 1.5138 348000 0.6654 1.2182
1.2568 1.5225 350000 0.6653 1.2179
1.2478 1.5312 352000 0.6653 1.2179
1.2649 1.5399 354000 0.6655 1.2171
1.271 1.5486 356000 0.6655 1.2172
1.2506 1.5573 358000 0.6656 1.2167
1.2516 1.5660 360000 0.6657 1.2165
1.2484 1.5747 362000 0.6657 1.2161
1.2417 1.5834 364000 0.6658 1.2159
1.2707 1.5921 366000 0.6660 1.2153
1.2597 1.6008 368000 0.6659 1.2151
1.2522 1.6095 370000 0.6660 1.2148
1.2593 1.6182 372000 0.6661 1.2143
1.2579 1.6269 374000 0.6661 1.2145
1.2385 1.6356 376000 0.6662 1.2139
1.25 1.6443 378000 0.6663 1.2136
1.2412 1.6530 380000 0.6664 1.2131
1.2242 1.6617 382000 0.6665 1.2132
1.2516 1.6704 384000 0.6665 1.2128
1.2533 1.6791 386000 0.6666 1.2122
1.2474 1.6878 388000 0.6667 1.2120
1.2405 1.6965 390000 0.6667 1.2119
1.2466 1.7052 392000 0.6666 1.2119
1.2443 1.7139 394000 0.6667 1.2115
1.2422 1.7226 396000 0.6668 1.2112
1.2298 1.7313 398000 0.6669 1.2111
1.2333 1.7400 400000 0.6669 1.2105
1.2491 1.7487 402000 0.6669 1.2105
1.2368 1.7574 404000 0.6671 1.2102
1.2435 1.7661 406000 0.6673 1.2097
1.2552 1.7748 408000 0.6673 1.2094
1.2509 1.7835 410000 0.6675 1.2089
1.2477 1.7922 412000 0.6673 1.2093
1.2395 1.8009 414000 0.6673 1.2087
1.2417 1.8096 416000 0.6674 1.2088
1.2526 1.8183 418000 0.6674 1.2085
1.2516 1.8270 420000 0.6675 1.2082
1.2542 1.8357 422000 0.6675 1.2083
1.2336 1.8444 424000 0.6676 1.2078
1.2376 1.8531 426000 0.6675 1.2079
1.2481 1.8618 428000 0.6678 1.2076
1.2409 1.8705 430000 0.6677 1.2073
1.2646 1.8792 432000 0.6677 1.2072
1.2329 1.8879 434000 0.6678 1.2070
1.2492 1.8966 436000 0.6679 1.2067
1.2362 1.9053 438000 0.6678 1.2069
1.2625 1.9140 440000 0.6679 1.2066
1.2336 1.9227 442000 0.6680 1.2065
1.2393 1.9314 444000 0.6680 1.2063
1.2393 1.9401 446000 0.6680 1.2062
1.2454 1.9488 448000 0.6680 1.2060
1.2429 1.9575 450000 0.6680 1.2059
1.2477 1.9662 452000 0.6681 1.2059
1.2356 1.9749 454000 0.6681 1.2059
1.242 1.9836 456000 0.6681 1.2057
1.2324 1.9923 458000 0.6681 1.2058
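
The log rows can be read programmatically to recover quantities the card does not state directly. The sketch below parses a row in the column order used above, estimates the number of optimizer steps per epoch from the Step and Epoch columns, and evaluates a linear learning-rate decay. The names `LogRow`, `parse_row`, `steps_per_epoch`, and `linear_lr` are illustrative, and the schedule assumes no warmup, which the card does not confirm.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    accuracy: float
    validation_loss: float


def parse_row(line):
    """Parse one whitespace-separated row of the training log."""
    tl, ep, st, acc, vl = line.split()
    return LogRow(float(tl), float(ep), int(st), float(acc), float(vl))


def steps_per_epoch(row):
    """Estimate optimizer steps per epoch; the Epoch column is rounded, so this is approximate."""
    return row.step / row.epoch


def linear_lr(step, total_steps, base_lr=5e-5):
    """Linear decay from base_lr to 0 over total_steps, assuming no warmup."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


# Last row of the table above.
last = parse_row("1.2324 1.9923 458000 0.6681 1.2058")
per_epoch = steps_per_epoch(last)
total = round(per_epoch * 2)  # two epochs, per the hyperparameter list

print(f"steps per epoch ~ {per_epoch:.0f}")
print(f"learning rate at the last logged step ~ {linear_lr(last.step, total):.2e}")
```

The last row implies roughly 229,900 steps per epoch. With a train batch size of 2, and assuming a single device with no gradient accumulation, that corresponds to about 460,000 training sequences per epoch.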

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 124M params (Safetensors, tensor type F32)