NuExtract-large 7b and NuExtract 3.8B have the same model file size
The size of model.safetensors is the same for both NuExtract-large 7b and NuExtract 3.8B. Since NuExtract-large has nearly twice the number of parameters, shouldn't the file sizes be different? I'm checking this to verify whether these two are the same model or different models.
No, it's not the same model; this one is based on phi3-small. The difference comes from the fact that you need to save/load the weights in bf16 (mostly because we couldn't run the training in full precision because of flash-attention).
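For anyone who wants to check the arithmetic, here is a rough sketch of why the two files can end up at a similar size. It assumes the 3.8B checkpoint is stored in float32 (4 bytes per parameter), which is not stated in this thread, and it uses the rounded parameter counts from the title. The helper name `checkpoint_size_gb` is made up for illustration, and the estimate ignores file metadata.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def checkpoint_size_gb(num_params: float, dtype: str) -> float:
    """Approximate weight file size in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# NuExtract-large: ~7B parameters, saved in bf16 as described above.
large_bf16 = checkpoint_size_gb(7e9, "bfloat16")

# NuExtract: ~3.8B parameters, ASSUMED here to be saved in float32.
small_fp32 = checkpoint_size_gb(3.8e9, "float32")

print(f"NuExtract-large (bf16): {large_bf16:.1f} GB")
print(f"NuExtract (fp32, assumed): {small_fp32:.1f} GB")
```

Under that assumption, halving the bytes per parameter roughly cancels out the near doubling of the parameter count, so similar file sizes do not imply identical models.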
Ohh I see, you have also updated the model loading script:
model = AutoModelForCausalLM.from_pretrained("numind/NuExtract", trust_remote_code=True, torch_dtype=torch.bfloat16)
I checked 10 hrs ago, and it was:
model = AutoModelForCausalLM.from_pretrained("numind/NuExtract", trust_remote_code=True)
so I was a bit confused that the file size is the same for both NuExtract-large 7b and NuExtract 3.8B.
Thanks for clarifying @Alexandre-Numind