Sentence Similarity
sentence-transformers
PyTorch
Transformers
English
t5
text-embedding
embeddings
information-retrieval
beir
text-classification
language-model
text-clustering
text-semantic-similarity
text-evaluation
prompt-retrieval
text-reranking
feature-extraction
English
Sentence Similarity
natural_questions
ms_marco
fever
hotpot_qa
mteb
Eval Results
WARN: You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference
#10
by
greenpau
- opened
I am loading the model from cache. When I do that, the model does not load files in 2_Dense
directory (which are 768d vectors) and it load from root which is 1024d vectors.
Why is this message showing up? How should I address the issue?
2023-06-23 11:56:07,939 product-manual-indexer [DEBUG] loading AutoTokenizer from /app/data/cache/instructor-large/hkunlp_instructor-large
2023-06-23 11:56:07,984 product-manual-indexer [DEBUG] loading AutoModel from /app/data/cache/instructor-large/hkunlp_instructor-large
Some weights of T5Model were not initialized from the model checkpoint at /app/data/cache/instructor-large/hkunlp_instructor-large and are newly initialized: ['decoder.block.12.layer.2.DenseReluDense.wo.weight', 'decoder.block.22.layer.1.EncDecAttention.o.weight', 'decoder.block.19.layer.0.SelfAttention.k.weight', 'decoder.block.8.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.1.EncDecAttention.o.weight', 'decoder.block.17.layer.0.SelfAttention.q.weight', 'decoder.block.10.layer.0.SelfAttention.o.weight', 'decoder.block.18.layer.1.EncDecAttention.k.weight', 'decoder.block.3.layer.0.SelfAttention.q.weight', 'decoder.block.2.layer.2.DenseReluDense.wo.weight', 'decoder.block.17.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.2.DenseReluDense.wi.weight', 'decoder.block.0.layer.1.EncDecAttention.q.weight', 'decoder.block.18.layer.0.layer_norm.weight', 'decoder.block.19.layer.2.layer_norm.weight', 'decoder.block.21.layer.0.SelfAttention.o.weight', 'decoder.block.9.layer.0.SelfAttention.o.weight', 'decoder.block.7.layer.0.SelfAttention.v.weight', 'decoder.block.12.layer.0.SelfAttention.k.weight', 'decoder.block.7.layer.0.SelfAttention.k.weight', 'decoder.block.3.layer.1.EncDecAttention.k.weight', 'decoder.block.13.layer.0.layer_norm.weight', 'decoder.block.0.layer.2.layer_norm.weight', 'decoder.block.16.layer.1.EncDecAttention.k.weight', 'decoder.block.15.layer.0.SelfAttention.o.weight', 'decoder.block.6.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.0.SelfAttention.o.weight', 'decoder.block.21.layer.2.DenseReluDense.wo.weight', 'decoder.block.4.layer.1.layer_norm.weight', 'decoder.block.18.layer.0.SelfAttention.o.weight', 'decoder.block.23.layer.1.EncDecAttention.q.weight', 'decoder.block.17.layer.0.SelfAttention.o.weight', 'decoder.block.20.layer.0.SelfAttention.k.weight', 'decoder.block.5.layer.0.layer_norm.weight', 'decoder.block.9.layer.2.DenseReluDense.wi.weight', 'decoder.block.12.layer.2.DenseReluDense.wi.weight', 'decoder.block.6.layer.2.DenseReluDense.wo.weight', 'decoder.block.23.layer.0.SelfAttention.v.weight', 'decoder.block.19.layer.0.SelfAttention.q.weight', 'decoder.block.2.layer.0.SelfAttention.v.weight', 'decoder.block.1.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.1.EncDecAttention.o.weight', 'decoder.block.17.layer.1.EncDecAttention.k.weight', 'decoder.block.18.layer.1.EncDecAttention.v.weight', 'decoder.block.0.layer.0.SelfAttention.q.weight', 'decoder.block.22.layer.1.EncDecAttention.k.weight', 'decoder.block.20.layer.2.layer_norm.weight', 'decoder.block.13.layer.0.SelfAttention.o.weight', 'decoder.block.2.layer.1.EncDecAttention.k.weight', 'decoder.block.8.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.0.SelfAttention.o.weight', 'decoder.block.17.layer.2.DenseReluDense.wo.weight', 'decoder.block.6.layer.0.SelfAttention.k.weight', 'decoder.block.7.layer.1.layer_norm.weight', 'decoder.block.23.layer.1.layer_norm.weight', 'decoder.block.17.layer.1.EncDecAttention.q.weight', 'decoder.block.5.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.1.layer_norm.weight', 'decoder.block.13.layer.0.SelfAttention.q.weight', 'decoder.block.3.layer.2.DenseReluDense.wi.weight', 'decoder.block.11.layer.0.SelfAttention.k.weight', 'decoder.block.8.layer.1.layer_norm.weight', 'decoder.block.10.layer.2.DenseReluDense.wi.weight', 'decoder.block.20.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.2.DenseReluDense.wi.weight', 'decoder.block.1.layer.2.DenseReluDense.wi.weight', 'decoder.block.17.layer.2.layer_norm.weight', 'decoder.block.1.layer.0.SelfAttention.q.weight', 'decoder.block.23.layer.2.layer_norm.weight', 'decoder.block.23.layer.0.layer_norm.weight', 'decoder.block.15.layer.0.SelfAttention.k.weight', 'decoder.block.21.layer.2.DenseReluDense.wi.weight', 'decoder.block.13.layer.1.EncDecAttention.q.weight', 'decoder.block.5.layer.1.EncDecAttention.v.weight', 'decoder.block.11.layer.2.DenseReluDense.wi.weight', 'decoder.block.16.layer.0.SelfAttention.k.weight', 'decoder.block.17.layer.1.layer_norm.weight', 'decoder.block.4.layer.0.SelfAttention.o.weight', 'decoder.block.23.layer.2.DenseReluDense.wi.weight', 'decoder.block.15.layer.1.EncDecAttention.o.weight', 'decoder.block.21.layer.0.SelfAttention.k.weight', 'decoder.block.11.layer.1.EncDecAttention.o.weight', 'decoder.block.9.layer.1.EncDecAttention.q.weight', 'decoder.block.13.layer.2.layer_norm.weight', 'decoder.block.3.layer.0.SelfAttention.k.weight', 'decoder.block.20.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.1.layer_norm.weight', 'decoder.block.10.layer.2.layer_norm.weight', 'decoder.block.7.layer.1.EncDecAttention.o.weight', 'decoder.block.10.layer.1.EncDecAttention.o.weight', 'decoder.block.18.layer.2.DenseReluDense.wi.weight', 'decoder.block.9.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.1.layer_norm.weight', 'decoder.block.19.layer.0.SelfAttention.v.weight', 'decoder.block.8.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.0.SelfAttention.k.weight', 'decoder.block.5.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.0.SelfAttention.k.weight', 'decoder.block.6.layer.0.SelfAttention.q.weight', 'decoder.block.3.layer.1.EncDecAttention.v.weight', 'decoder.block.12.layer.2.layer_norm.weight', 'decoder.block.6.layer.0.layer_norm.weight', 'decoder.block.12.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.1.EncDecAttention.k.weight', 'decoder.block.10.layer.0.SelfAttention.v.weight', 'decoder.block.11.layer.2.DenseReluDense.wo.weight', 'decoder.block.6.layer.1.EncDecAttention.o.weight', 'decoder.block.10.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.2.layer_norm.weight', 'decoder.block.15.layer.1.layer_norm.weight', 'decoder.block.0.layer.1.EncDecAttention.o.weight', 'decoder.block.22.layer.1.EncDecAttention.q.weight', 'decoder.block.9.layer.2.layer_norm.weight', 'decoder.block.22.layer.1.layer_norm.weight', 'decoder.block.4.layer.0.layer_norm.weight', 'decoder.block.5.layer.0.SelfAttention.k.weight', 'decoder.block.5.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.1.EncDecAttention.v.weight', 'decoder.block.15.layer.1.EncDecAttention.q.weight', 'decoder.block.3.layer.2.DenseReluDense.wo.weight', 'decoder.block.16.layer.1.layer_norm.weight', 'decoder.block.1.layer.1.EncDecAttention.q.weight', 'decoder.block.22.layer.0.SelfAttention.k.weight', 'decoder.block.16.layer.0.SelfAttention.q.weight', 'decoder.block.18.layer.1.EncDecAttention.q.weight', 'decoder.block.11.layer.2.layer_norm.weight', 'decoder.block.5.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.0.SelfAttention.q.weight', 'decoder.block.14.layer.0.SelfAttention.k.weight', 'decoder.block.0.layer.2.DenseReluDense.wo.weight', 'decoder.block.8.layer.1.EncDecAttention.q.weight', 'decoder.block.18.layer.0.SelfAttention.v.weight', 'decoder.block.14.layer.0.layer_norm.weight', 'decoder.block.18.layer.1.layer_norm.weight', 'decoder.block.20.layer.0.SelfAttention.q.weight', 'decoder.block.1.layer.1.EncDecAttention.v.weight', 'decoder.block.1.layer.2.DenseReluDense.wo.weight', 'decoder.block.2.layer.2.DenseReluDense.wi.weight', 'decoder.block.13.layer.2.DenseReluDense.wo.weight', 'decoder.block.7.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.1.EncDecAttention.q.weight', 'decoder.block.8.layer.2.DenseReluDense.wo.weight', 'decoder.block.10.layer.0.layer_norm.weight', 'decoder.block.22.layer.1.EncDecAttention.v.weight', 'decoder.block.18.layer.0.SelfAttention.q.weight', 'decoder.block.3.layer.0.layer_norm.weight', 'decoder.block.18.layer.1.EncDecAttention.o.weight', 'decoder.block.15.layer.2.DenseReluDense.wi.weight', 'decoder.block.12.layer.1.EncDecAttention.q.weight', 'decoder.block.19.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.0.SelfAttention.k.weight', 'decoder.block.6.layer.2.DenseReluDense.wi.weight', 'decoder.block.23.layer.0.SelfAttention.o.weight', 'decoder.block.11.layer.1.EncDecAttention.q.weight', 'decoder.block.5.layer.0.SelfAttention.v.weight', 'decoder.block.8.layer.2.layer_norm.weight', 'decoder.block.22.layer.2.DenseReluDense.wo.weight', 'decoder.block.18.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight', 'decoder.block.20.layer.1.EncDecAttention.q.weight', 'decoder.block.4.layer.1.EncDecAttention.o.weight', 'decoder.block.4.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.2.layer_norm.weight', 'decoder.block.7.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.0.SelfAttention.o.weight', 'decoder.block.10.layer.0.SelfAttention.q.weight', 'decoder.block.20.layer.1.layer_norm.weight', 'decoder.block.19.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.2.DenseReluDense.wi.weight', 'decoder.block.8.layer.2.DenseReluDense.wi.weight', 'decoder.block.10.layer.1.EncDecAttention.k.weight', 'decoder.block.3.layer.0.SelfAttention.v.weight', 'decoder.block.14.layer.1.EncDecAttention.q.weight', 'decoder.block.13.layer.1.layer_norm.weight', 'decoder.block.1.layer.1.layer_norm.weight', 'decoder.block.0.layer.0.layer_norm.weight', 'decoder.block.23.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.2.layer_norm.weight', 'decoder.block.9.layer.0.layer_norm.weight', 'decoder.block.13.layer.0.SelfAttention.k.weight', 'decoder.block.7.layer.2.layer_norm.weight', 'decoder.block.7.layer.0.SelfAttention.o.weight', 'decoder.block.21.layer.1.EncDecAttention.k.weight', 'decoder.block.23.layer.1.EncDecAttention.v.weight', 'decoder.block.16.layer.0.layer_norm.weight', 'decoder.block.3.layer.0.SelfAttention.o.weight', 'decoder.block.18.layer.2.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.v.weight', 'decoder.block.22.layer.2.layer_norm.weight', 'decoder.block.9.layer.1.layer_norm.weight', 'decoder.block.21.layer.1.layer_norm.weight', 'decoder.block.16.layer.2.DenseReluDense.wi.weight', 'decoder.block.13.layer.1.EncDecAttention.o.weight', 'decoder.block.4.layer.1.EncDecAttention.v.weight', 'decoder.block.14.layer.2.layer_norm.weight', 'decoder.block.14.layer.1.layer_norm.weight', 'decoder.block.6.layer.1.EncDecAttention.v.weight', 'decoder.block.10.layer.0.SelfAttention.k.weight', 'decoder.block.0.layer.1.EncDecAttention.v.weight', 'decoder.block.4.layer.2.DenseReluDense.wo.weight', 'decoder.block.7.layer.1.EncDecAttention.v.weight', 'decoder.block.5.layer.1.layer_norm.weight', 'decoder.block.2.layer.1.EncDecAttention.q.weight', 'decoder.block.17.layer.0.SelfAttention.v.weight', 'decoder.block.19.layer.1.EncDecAttention.k.weight', 'decoder.block.17.layer.1.EncDecAttention.v.weight', 'decoder.block.3.layer.1.layer_norm.weight', 'decoder.block.13.layer.1.EncDecAttention.k.weight', 'decoder.block.16.layer.1.EncDecAttention.v.weight', 'decoder.block.11.layer.0.SelfAttention.o.weight', 'decoder.block.19.layer.1.EncDecAttention.q.weight', 'decoder.block.23.layer.2.DenseReluDense.wo.weight', 'decoder.block.21.layer.1.EncDecAttention.v.weight', 'decoder.block.20.layer.1.EncDecAttention.v.weight', 'decoder.block.7.layer.2.DenseReluDense.wi.weight', 'decoder.block.14.layer.0.SelfAttention.o.weight', 'decoder.block.13.layer.2.DenseReluDense.wi.weight', 'decoder.block.9.layer.1.EncDecAttention.v.weight', 'decoder.block.2.layer.1.EncDecAttention.o.weight', 'decoder.block.6.layer.1.EncDecAttention.q.weight', 'decoder.block.14.layer.1.EncDecAttention.k.weight', 'decoder.block.15.layer.1.EncDecAttention.k.weight', 'decoder.block.1.layer.0.layer_norm.weight', 'decoder.block.20.layer.1.EncDecAttention.o.weight', 'decoder.block.23.layer.1.EncDecAttention.o.weight', 'decoder.block.19.layer.2.DenseReluDense.wi.weight', 'decoder.block.0.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.2.layer_norm.weight', 'decoder.block.4.layer.0.SelfAttention.v.weight', 'decoder.block.4.layer.1.EncDecAttention.q.weight', 'decoder.block.20.layer.2.DenseReluDense.wi.weight', 'decoder.block.8.layer.0.layer_norm.weight', 'decoder.block.16.layer.2.DenseReluDense.wo.weight', 'decoder.block.15.layer.2.layer_norm.weight', 'decoder.final_layer_norm.weight', 'decoder.block.21.layer.0.layer_norm.weight', 'decoder.block.11.layer.1.layer_norm.weight', 'decoder.block.16.layer.2.layer_norm.weight', 'decoder.block.2.layer.0.SelfAttention.q.weight', 'decoder.block.12.layer.1.EncDecAttention.k.weight', 'decoder.block.0.layer.0.SelfAttention.o.weight', 'decoder.block.7.layer.1.EncDecAttention.k.weight', 'decoder.block.9.layer.1.EncDecAttention.k.weight', 'decoder.block.1.layer.0.SelfAttention.k.weight', 'decoder.block.12.layer.0.layer_norm.weight', 'decoder.block.19.layer.0.layer_norm.weight', 'decoder.block.12.layer.0.SelfAttention.v.weight', 'decoder.block.9.layer.0.SelfAttention.k.weight', 'decoder.block.1.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.2.layer_norm.weight', 'decoder.block.5.layer.2.layer_norm.weight', 'decoder.block.21.layer.0.SelfAttention.v.weight', 'decoder.block.22.layer.0.SelfAttention.o.weight', 'decoder.block.22.layer.0.SelfAttention.q.weight', 'decoder.block.16.layer.0.SelfAttention.o.weight', 'decoder.block.9.layer.0.SelfAttention.q.weight', 'decoder.block.15.layer.0.SelfAttention.q.weight', 'decoder.block.15.layer.0.layer_norm.weight', 'decoder.block.8.layer.1.EncDecAttention.o.weight', 'decoder.block.22.layer.0.layer_norm.weight', 'decoder.block.15.layer.0.SelfAttention.v.weight', 'decoder.block.10.layer.2.DenseReluDense.wo.weight', 'decoder.block.4.layer.2.DenseReluDense.wi.weight', 'decoder.block.0.layer.0.SelfAttention.v.weight', 'decoder.block.20.layer.0.SelfAttention.o.weight', 'decoder.block.18.layer.0.SelfAttention.k.weight', 'decoder.block.14.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.0.SelfAttention.v.weight', 'decoder.block.5.layer.1.EncDecAttention.k.weight', 'decoder.block.3.layer.1.EncDecAttention.q.weight', 'decoder.block.11.layer.0.SelfAttention.q.weight', 'decoder.block.12.layer.1.EncDecAttention.o.weight', 'decoder.block.10.layer.1.layer_norm.weight', 'decoder.block.6.layer.0.SelfAttention.v.weight', 'decoder.block.11.layer.0.layer_norm.weight', 'decoder.block.14.layer.0.SelfAttention.q.weight', 'decoder.block.9.layer.1.EncDecAttention.o.weight', 'decoder.block.11.layer.1.EncDecAttention.v.weight', 'decoder.block.5.layer.1.EncDecAttention.o.weight', 'decoder.block.14.layer.1.EncDecAttention.o.weight', 'decoder.block.13.layer.1.EncDecAttention.v.weight', 'decoder.block.10.layer.1.EncDecAttention.v.weight', 'decoder.block.2.layer.0.layer_norm.weight', 'decoder.block.1.layer.2.layer_norm.weight', 'decoder.block.4.layer.0.SelfAttention.q.weight', 'decoder.block.13.layer.0.SelfAttention.v.weight', 'decoder.block.7.layer.2.DenseReluDense.wo.weight', 'decoder.block.22.layer.0.SelfAttention.v.weight', 'decoder.block.17.layer.1.EncDecAttention.o.weight', 'decoder.block.17.layer.0.layer_norm.weight', 'decoder.block.11.layer.1.EncDecAttention.k.weight', 'decoder.block.16.layer.1.EncDecAttention.q.weight', 'decoder.block.4.layer.0.SelfAttention.k.weight', 'decoder.block.12.layer.1.layer_norm.weight', 'decoder.block.7.layer.0.layer_norm.weight', 'decoder.block.20.layer.0.SelfAttention.v.weight', 'decoder.block.5.layer.2.DenseReluDense.wi.weight', 'decoder.block.12.layer.1.EncDecAttention.v.weight', 'decoder.block.19.layer.1.EncDecAttention.o.weight', 'decoder.block.12.layer.0.SelfAttention.q.weight', 'decoder.block.21.layer.1.EncDecAttention.o.weight', 'decoder.block.20.layer.0.layer_norm.weight', 'decoder.block.8.layer.0.SelfAttention.q.weight', 'decoder.block.17.layer.2.DenseReluDense.wi.weight', 'decoder.block.9.layer.0.SelfAttention.v.weight', 'decoder.block.15.layer.2.DenseReluDense.wo.weight', 'decoder.block.16.layer.1.EncDecAttention.o.weight', 'decoder.block.2.layer.1.layer_norm.weight', 'decoder.block.23.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.0.SelfAttention.o.weight', 'decoder.block.2.layer.1.EncDecAttention.v.weight', 'decoder.block.15.layer.1.EncDecAttention.v.weight', 'decoder.block.23.layer.0.SelfAttention.k.weight', 'decoder.block.16.layer.0.SelfAttention.v.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Hi, Thanks a lot for your interests in the INSTRUCTOR model!
As the INSTRUCTOR is only an encoder model, all the weights of the decoder will not be used.
Please re-open the discussion if you have any questions or comments!
multi-train
changed discussion status to
closed