lemonteaa/nanogpt-speedrun


datasets:
  - HuggingFaceFW/fineweb
base_model:
  - openai-community/gpt2

NanoGPT Speedrun

A reproduction of https://github.com/KellerJordan/modded-nanogpt, done for fun and as a learning exercise.

Run Info

baseline/

Run on Lightning cloud, using a single L40S GPU
Batch size: 32
VRAM usage: 26.95 GB (25698 MiB as reported by nvidia-smi)
Speed: 4 seconds per step, 3200 steps in total
Checkpoints saved every 320 steps
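
The figures above imply a few derived numbers, sketched below. This is a back-of-the-envelope calculation, not output from the actual run. It assumes the 4 seconds per step holds for the whole run, ignores any evaluation or checkpointing overhead, and assumes no checkpoint is written at step 0. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope figures derived from the run settings listed above.
# Assumptions: constant step time, no eval/checkpoint overhead, no step-0 checkpoint.
SECONDS_PER_STEP = 4
TOTAL_STEPS = 3200
CHECKPOINT_EVERY = 320
VRAM_MIB = 25698  # nvidia-smi reports memory in MiB (2**20 bytes)


def mib_to_gb(mib: float) -> float:
    """Convert mebibytes (2**20 bytes) to decimal gigabytes (10**9 bytes)."""
    return mib * 2**20 / 1e9


total_hours = SECONDS_PER_STEP * TOTAL_STEPS / 3600
num_checkpoints = TOTAL_STEPS // CHECKPOINT_EVERY
vram_gb = mib_to_gb(VRAM_MIB)

print(f"estimated wall-clock time: {total_hours:.2f} h")
print(f"checkpoints written: {num_checkpoints}")
print(f"VRAM usage: {vram_gb:.2f} GB")
```

The unit conversion explains why the two VRAM numbers differ: 25698 MiB and 26.95 GB are the same amount of memory expressed in binary and decimal units.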

Training loss

An empirical check of the neural scaling law, using a straight-line fit to the training loss on log-log axes:

(Fitted line: log y = -0.11 * log x + 0.9, where x is the step (0 to 3200) and y is the training loss)
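
A line in log-log space corresponds to a power law, y = C * x^(-0.11). The sketch below shows one way such a fit could be obtained with NumPy; it is not the script used for this run. The log base is not stated above, so base 10 is an assumption here (under it, the line gives a loss of roughly 3.27 at step 3200). The data is synthetic, generated from the quoted line plus noise, and stands in for the real training log. Steps start at 1 because log(0) is undefined. The function names are illustrative.

```python
import numpy as np

# Coefficients quoted above. The log base is ASSUMED to be 10.
SLOPE, INTERCEPT = -0.11, 0.9


def fitted_loss(step):
    """Loss given by log10(y) = SLOPE * log10(x) + INTERCEPT."""
    x = np.asarray(step, dtype=float)
    return 10.0 ** (SLOPE * np.log10(x) + INTERCEPT)


def fit_power_law(steps, losses):
    """Least-squares line through (log10 step, log10 loss).

    Returns (slope, intercept) of that line.
    """
    slope, intercept = np.polyfit(np.log10(steps), np.log10(losses), deg=1)
    return slope, intercept


# SYNTHETIC data for illustration only, NOT the real training log:
# the quoted line with small multiplicative noise.
rng = np.random.default_rng(0)
steps = np.arange(1, 3201)  # start at 1 to avoid log10(0)
noise = 10.0 ** rng.normal(0.0, 0.005, size=steps.size)
losses = fitted_loss(steps) * noise

slope, intercept = fit_power_law(steps, losses)
print(f"recovered slope: {slope:.3f}, intercept: {intercept:.3f}")
print(f"loss on the fitted line at step 3200: {float(fitted_loss(3200)):.2f}")
```

To apply this to a real run, replace `steps` and `losses` with the logged values. Early steps dominate the left side of a log-log plot, so the fit can be sensitive to where the range starts.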

Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP)