size mismatch for model.layers.29.self_attn.k_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.29.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.29.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.29.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.30.self_attn.k_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.30.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.30.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.30.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.31.self_attn.k_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.31.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.31.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.31.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
I'm getting this error when loading the model. Any help?
Make sure to load it as a group size 32 model; you might've set whatever script you're using to load it with a different group size (e.g. as a 128g model), which is why the expected tensor shapes don't match the checkpoint.
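You can actually confirm the group size from the traceback alone: in GPTQ-style checkpoints, `scales`/`qzeros` have one row per quantization group, so the row count equals the layer's input features divided by the group size. A minimal sanity-check sketch (the helper name is just for illustration, not part of any loading script):

```python
def infer_group_size(in_features: int, num_groups: int) -> int:
    """Recover the GPTQ group size from a scales/qzeros row count.

    GPTQ stores one row of scales/qzeros per quantization group,
    so num_groups = in_features / group_size.
    """
    if in_features % num_groups != 0:
        raise ValueError("row count does not evenly divide in_features")
    return in_features // num_groups

# Shapes from the traceback (LLaMA-7B: hidden size 4096, MLP size 11008):
print(infer_group_size(4096, 128))    # attention projections -> 32
print(infer_group_size(11008, 344))   # mlp.down_proj         -> 32
```

Every layer in the traceback works out to 32, so pass group size 32 (not 128) to whatever loading script you're using.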