Finetune on longer sequences
#1 by joelniklaus - opened
Hi guys,
Great model suite, thanks a lot!
I am interested in finetuning this model on longer sequences (4096 to 8192 tokens). What would you say is the easiest way to do this?
Cheers,
Joel
I have the same question
It's odd, but https://moon-ci-docs.huggingface.co/docs/transformers/pr_22810/en/model_doc/gpt_neox_alibi (and only there) documents a GPT-NeoX ALiBi model; it might be from a pull request that never got merged.
Pythia is in the same family as NeoX; I'm happy to look at other solutions, but I'm basically after the same thing.
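For what it's worth, one hedged starting point (not confirmed by anyone in this thread): Pythia/GPT-NeoX uses rotary position embeddings, which have no learned per-position weight table, so in the `transformers` `GPTNeoX` implementation you can raise `max_position_embeddings` in the config and still load the pretrained weights, then finetune on longer sequences. Whether the model actually extrapolates well at 4k-8k without tricks like position interpolation is a separate question. The sketch below builds a tiny randomly initialized model just to stay self-contained; in practice you would pass the modified config to `from_pretrained("EleutherAI/pythia-1.4b", ...)` (model size is your choice, used here only as an example).

```python
from transformers import GPTNeoXConfig, GPTNeoXForCausalLM

# Tiny stand-in config so this sketch runs without downloading weights.
# For a real run: config = GPTNeoXConfig.from_pretrained("EleutherAI/pythia-1.4b")
config = GPTNeoXConfig(
    vocab_size=1024,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=256,
    max_position_embeddings=8192,  # raised from the usual 2048
)

# Rotary embeddings carry no learned position parameters, so a pretrained
# checkpoint would load into this longer-context config unchanged:
# model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-1.4b", config=config)
model = GPTNeoXForCausalLM(config)
print(config.max_position_embeddings)
```

From here the finetuning loop itself is standard (e.g. `Trainer` with your long-sequence dataset); the only change versus a normal finetune is the raised context limit and the memory cost of attention at 8192 tokens.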