Add TF weights
Model converted by the transformers' pt_to_tf CLI. All converted model outputs and hidden layers were validated against their PyTorch counterparts.
Maximum crossload output difference=3.898e-05; Maximum crossload hidden layer difference=7.257e-04;
Maximum conversion output difference=3.898e-05; Maximum conversion hidden layer difference=7.257e-04;
List of maximum output differences above the threshold (1e-19):
logits: 3.898e-05
List of maximum hidden layer differences above the threshold (1e-19):
hidden_states[0]: 6.676e-06
hidden_states[1]: 1.574e-05
hidden_states[2]: 2.909e-05
hidden_states[3]: 3.457e-05
hidden_states[4]: 8.821e-05
hidden_states[5]: 3.071e-04
hidden_states[6]: 5.455e-04
hidden_states[7]: 6.905e-04
hidden_states[8]: 7.019e-04
hidden_states[9]: 7.257e-04
hidden_states[10]: 7.019e-04
hidden_states[11]: 6.990e-04
hidden_states[12]: 5.703e-04
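
For readers who want to see what a "maximum difference" of this kind means, here is a minimal sketch of one way such a report could be computed. It is not the CLI's actual implementation: the helper names `max_abs_diff` and `diffs_above_threshold` are hypothetical, the inputs are plain flattened lists of floats rather than framework tensors, and the toy values are not taken from this model.

```python
def max_abs_diff(reference, converted):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(reference) != len(converted):
        raise ValueError("sequences must have the same length")
    # default=0.0 covers the empty case, where there is nothing to compare.
    return max((abs(r - c) for r, c in zip(reference, converted)), default=0.0)


def diffs_above_threshold(reference_outputs, converted_outputs, threshold):
    """Return {name: max difference} for each output whose difference exceeds threshold."""
    report = {}
    for name, reference in reference_outputs.items():
        diff = max_abs_diff(reference, converted_outputs[name])
        if diff > threshold:
            report[name] = diff
    return report


# Toy values for illustration only (hypothetical, flattened outputs).
pt_outputs = {"logits": [0.5, -1.25, 2.0], "hidden_states[0]": [0.0, 1.0]}
tf_outputs = {"logits": [0.5, -1.0, 2.0], "hidden_states[0]": [0.0, 1.0]}

report = diffs_above_threshold(pt_outputs, tf_outputs, threshold=1e-5)
for name, diff in report.items():
    print(f"{name}: {diff:.3e}")
```

In practice the same comparison would usually be done on NumPy arrays obtained from both frameworks' outputs; the pure-Python version is used here only to keep the sketch dependency-free.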