what's the dtype (precision) used during training?

#33
by keunwoochoi - opened

hi, thanks for the nice model.
when we load the model through transformers, the default dtype is float32.

https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/blob/main/config.json#L49
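for example, a quick check (assuming transformers is installed; trust_remote_code is needed for this model's custom modeling code):

```python
from transformers import AutoModel

# loading without specifying a dtype falls back to the config's float32
model = AutoModel.from_pretrained(
    "nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True
)
print(next(model.parameters()).dtype)  # torch.float32
```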

but the technical report says the model was trained in bfloat16.

i'd like to confirm which dtype is correct. asking because it matters: the model size doubles with float32.

any insight would be great. thanks!

Nomic AI org

hey @keunwoochoi the correct dtype is bf16!

zpn changed discussion status to closed
Nomic AI org

To clarify, we use torch.autocast with bfloat16 precision during training.
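For anyone curious, a minimal sketch of that pattern (hypothetical model, optimizer, and data, shown only to illustrate bfloat16 autocast; not the actual training code). Weights stay in float32, the forward pass runs in bfloat16, and unlike fp16 no GradScaler is needed:

```python
import torch

model = torch.nn.Linear(768, 768).cuda()          # parameters remain float32
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 768, device="cuda")
target = torch.randn(8, 768, device="cuda")

# forward and loss computed under bfloat16 autocast
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)
    loss = torch.nn.functional.mse_loss(out, target)

loss.backward()        # gradients land in float32
optimizer.step()
optimizer.zero_grad()
```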

Nomic AI org

I've also run inference with the model in bfloat16 and found pretty similar results, but YMMV @keunwoochoi
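A minimal sketch of what bfloat16 inference could look like with the standard transformers loading path (torch_dtype, mean pooling over the last hidden state, and a task prefix; the exact prefix and pooling here are illustrative rather than taken from this thread):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nomic-ai/nomic-embed-text-v1.5")
model = AutoModel.from_pretrained(
    "nomic-ai/nomic-embed-text-v1.5",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,   # load weights directly in bf16
).eval()

inputs = tokenizer(["search_document: hello world"], return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state           # bfloat16 activations
    mask = inputs["attention_mask"].unsqueeze(-1)
    emb = (hidden * mask).sum(1) / mask.sum(1)            # mean pooling
    emb = F.normalize(emb, p=2, dim=1)
```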
