What does each consolidated.0x.pt file contain, and how do I load the model from them?
#54, opened by Keely0419
My guess is that each consolidated.0x.pt file corresponds to one expert (is that correct?). If so, which weights does each file contain?
Is it expert 0x's weights plus all of the shared weights, or expert 0x's weights plus only a part of the shared weights?
Also, how do I load the model from the consolidated.xx.pt files? Is it possible to load only some of them for inference?
I'm confused on several points here, so thanks in advance for any help!
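
One way to check the first question empirically is to list the parameter names and shapes stored in each file and see which names show up in every file. Below is a minimal sketch, not a confirmed description of how these checkpoints are laid out. It assumes each file is an ordinary `torch.save` checkpoint holding a flat dict of parameter name to tensor. The helper names (`load_shard`, `record_shapes`, `keys_missing_somewhere`, `inspect_files`) and the file names in the usage note are hypothetical.

```python
def load_shard(path):
    """Load one checkpoint file on CPU.

    Assumption: the file is a torch.save'd flat dict of name -> tensor.
    """
    import torch  # imported lazily so the helpers below run without PyTorch

    # weights_only=True restricts unpickling to tensors and plain containers.
    return torch.load(path, map_location="cpu", weights_only=True)


def record_shapes(table, label, state):
    """Record the shape of every entry in `state` under table[name][label]."""
    for name, tensor in state.items():
        table.setdefault(name, {})[label] = tuple(tensor.shape)


def keys_missing_somewhere(table, labels):
    """Return the sorted parameter names that are absent from at least one file."""
    wanted = set(labels)
    return sorted(name for name, shapes in table.items() if set(shapes) != wanted)


def inspect_files(paths):
    """Print per-file shapes for every parameter, one file in memory at a time."""
    table = {}
    for path in paths:
        state = load_shard(path)
        record_shapes(table, path, state)
        del state  # release this file's tensors before loading the next one
    for name in sorted(table):
        print(name, table[name])
    print("not present in every file:", keys_missing_somewhere(table, paths))
    return table
```

Calling `inspect_files(["consolidated.00.pt", "consolidated.01.pt"])` (hypothetical file names) would print one line per parameter. A name that appears in every file with a different shape than the full layer would suggest the tensor is split across files, while a name that appears in only one file would suggest that file owns it. Matching shapes alone do not prove the values are identical, so comparing the tensors themselves would be a sensible follow-up.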