What is the difference between this and the original model?
#1 by hsuyab - opened
Hello,
- I wanted to understand how this sharded version is useful compared to the original model weights at EleutherAI/gpt-j-6b.
- Also, how did you do the sharding? Can you share a script for it?
- I got an error when trying to load the tokenizer:
OSError: Can't load tokenizer for 'sgugger/sharded-gpt-j-6B'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'sgugger/sharded-gpt-j-6B' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.
You can use the "sharded" branch on the official repo now. I created this one when it didn't exist.
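For anyone landing here later, a minimal sketch of what that could look like with `transformers`. It assumes the official repo is `EleutherAI/gpt-j-6b` (the one named in the question) and that the branch is passed as `revision="sharded"`. The tokenizer is loaded from the official repo, on the assumption that the error above came from this repo not shipping tokenizer files. The helper names `load_sharded_model` and `save_sharded` are made up for illustration. As far as I understand, the main benefit of a sharded checkpoint is lower peak RAM while loading, since the weights are read one shard at a time instead of as a single large file.

```python
"""Sketch: load GPT-J from the "sharded" branch, and re-shard a checkpoint."""

OFFICIAL_REPO = "EleutherAI/gpt-j-6b"  # repo named in the question
SHARDED_REVISION = "sharded"           # branch named in the reply


def load_sharded_model():
    """Hypothetical helper: fetch the tokenizer and the sharded weights."""
    # Imported here so nothing heavy happens until the function is called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Tokenizer from the official repo's default branch.
    tokenizer = AutoTokenizer.from_pretrained(OFFICIAL_REPO)
    # Weights from the "sharded" branch of the same repo.
    model = AutoModelForCausalLM.from_pretrained(
        OFFICIAL_REPO, revision=SHARDED_REVISION
    )
    return tokenizer, model


def save_sharded(model, output_dir, max_shard_size="2GB"):
    """Hypothetical helper: write `model` as shards no larger than the limit."""
    # `max_shard_size` is an argument of `save_pretrained`; the "2GB"
    # default used here is only an illustrative choice.
    model.save_pretrained(output_dir, max_shard_size=max_shard_size)
```

Calling `load_sharded_model()` downloads the full weights, so expect a large download. `save_sharded(model, "my-sharded-gpt-j")` then writes the shards into that directory, along with an index file that maps each weight to its shard.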
ohh okay
hsuyab changed discussion status to closed