Michael Pieler

MicPie

AI & ML interests

ML

Recent Activity

liked a Space about 1 month ago
k-mktr/gpu-poor-llm-arena

Organizations

EleutherAI · chemNLP · OpenBioML · Stability AI · Inverse Scaling Prize · DataComp

MicPie's activity

reacted to m-ric's post with 👀 4 months ago
๐—ง๐—ต๐—ฒ ๐—ต๐˜‚๐—ด๐—ฒ ๐—ฐ๐—ผ๐˜€๐˜ ๐—ผ๐—ณ ๐—ฟ๐—ฒ๐˜€๐—ฒ๐—ฎ๐—ฟ๐—ฐ๐—ต ๐—ผ๐—ป ๐—ณ๐—ฟ๐—ผ๐—ป๐˜๐—ถ๐—ฒ๐—ฟ ๐—Ÿ๐—Ÿ๐— ๐˜€ ๐Ÿ’ธ

Google DeepMind recently released a great paper, Scaling Exponents Across Parameterizations and Optimizers, that identifies optimal hyperparameters for training across different regimes, with data from 10,000 training runs.

One engineer decided to quantify the price of such a large-scale experiment.

😬 And the bill is hefty: ~13M USD

This exact number should be taken with a grain of salt, because many approximations were needed to reach the final figure.

โ›”๏ธ But still this ballpark means that for this sole experiment, the price is way over what most startups or research labs could afford.

This makes open-sourcing research more important than ever, to put everyone in the ecosystem on a roughly equal footing. Don't let OpenAI run ahead alone; they'll keep everything for themselves!

Read the full post that quantifies the paper's cost 👉 https://152334h.github.io/blog/scaling-exponents/
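
A rough back-of-the-envelope sketch of how such a cost estimate can be assembled, using the common ~6 FLOPs-per-parameter-per-token approximation for dense transformer training. All numbers below are illustrative placeholders, not the figures from the linked post or the paper:

```python
# Back-of-the-envelope cost estimate for a large hyperparameter sweep.
# All constants are illustrative placeholders, NOT values from the linked post.

def training_cost_usd(n_params: float, n_tokens: float,
                      flops_per_gpu_per_s: float, gpu_utilization: float,
                      usd_per_gpu_hour: float) -> float:
    """Estimate the cost of one training run.

    Uses the common approximation of ~6 FLOPs per parameter per token
    for a forward + backward pass of a dense transformer.
    """
    total_flops = 6 * n_params * n_tokens
    effective_flops_per_s = flops_per_gpu_per_s * gpu_utilization
    gpu_hours = total_flops / effective_flops_per_s / 3600
    return gpu_hours * usd_per_gpu_hour

# Hypothetical "average" run: 1B params x 20B tokens, on GPUs delivering
# ~300 TFLOP/s at 40% utilization, rented at 2 USD per GPU-hour.
cost_per_run = training_cost_usd(
    n_params=1e9, n_tokens=2e10,
    flops_per_gpu_per_s=3e14, gpu_utilization=0.4,
    usd_per_gpu_hour=2.0,
)
print(f"~{cost_per_run:,.0f} USD per run, "
      f"~{10_000 * cost_per_run:,.0f} USD for a 10,000-run sweep")
```

The real estimate in the linked post has to account for the paper's mix of model sizes, token budgets, and hardware, which is why so many approximations are involved.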
  • 1 reply
New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1 7 months ago

Training details?

1 reply · #2 opened 7 months ago by MicPie
New activity in JeanKaddour/minipile about 1 year ago

Domain and provenance annotation

9 replies · #1 opened about 1 year ago by haukur
New activity in allenai/peS2o over 1 year ago