Finetune on longer sequences

by joelniklaus - opened

Hi guys,
Great model suite, thanks a lot!
I am interested in finetuning this model on longer sequences (4096 to 8192 tokens). What would you say is the easiest way to do this?

I have the same question

It's weird, but on (and only there) there's a GPT-NeoX ALiBi implementation; it might be from a pull request that never got merged.
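
For context on why ALiBi comes up for longer sequences: it adds a fixed linear distance penalty to the attention scores instead of relying on position embeddings, so the bias can be built for any sequence length. Below is a minimal pure-Python sketch of that bias, assuming a power-of-two head count and folding the causal mask in as -inf. The function names are made up for illustration, and this is not the GPT-NeoX code mentioned above.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8 * h / num_heads) for h = 1..num_heads.

    Assumption: num_heads is a power of two (the simple case).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two head count")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j] to add to attention scores before softmax.

    For a key j at or before query i the bias is -slope_h * (i - j).
    Future keys (j > i) get -inf, which is the usual causal mask.
    """
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        bias.append(head)
    return bias


if __name__ == "__main__":
    # Tiny example; a real run would build this as a tensor, not nested lists.
    for row in alibi_bias(num_heads=2, seq_len=4)[0]:
        print(row)
```

Nothing in the sketch depends on a trained maximum length, so the same functions apply at 4096 or 8192 positions. Whether a given checkpoint can use it without retraining is a separate question.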

Pythia should be in the same family as NeoX. I'm happy to look at other solutions, but I'm basically looking for the same thing.

