Update ONNX weights (#24)
opened by Xenova
This PR:
- Uses the external data format to save the model (so the individual per-tensor files are no longer needed)
- Adds an fp16 version
- Slims model.onnx with onnxslim for a more optimized graph (a sketch of these conversion steps follows the table below)
+--------------------+------------------------------------------+------------------------------------------+
| Model Name | model.onnx | Op Set: 16 |
+--------------------+------------------------------------------+------------------------------------------+
| Model Info | Original Model | Slimmed Model |
+--------------------+------------------------------------------+------------------------------------------+
| IN: input_ids | int64: ('batch_size', 'sequence_length') | int64: ('batch_size', 'sequence_length') |
| IN: attention_mask | int64: ('batch_size', 'sequence_length') | int64: ('batch_size', 'sequence_length') |
| IN: task_id | int64: None | int64: None |
| OUT: text_embeds | float32: ('batch_size', | float32: ('batch_size', |
| | 'Addtext_embeds_dim_1', 1024) | 'sequence_length', 1024) |
| OUT: 13049 | float32: ('batch_size', 1024) | float32: ('batch_size', 1024) |
+--------------------+------------------------------------------+------------------------------------------+
| Add | 486 | 438 |
| Cast | 529 | 1 |
| Concat | 481 | 216 |
| Constant | 4047 | 0 |
| ConstantOfShape | 121 | 25 |
| Div | 337 | 121 |
| Einsum | 48 | 48 |
| Equal | 96 | 0 |
| Erf | 24 | 24 |
| Expand | 96 | 96 |
| Gather | 826 | 514 |
| Gemm | 1 | 1 |
| MatMul | 195 | 195 |
| Mul | 748 | 316 |
| Neg | 48 | 48 |
| Pow | 49 | 49 |
| ReduceMean | 98 | 98 |
| Reshape | 435 | 363 |
| Shape | 553 | 145 |
| Slice | 288 | 288 |
| Softmax | 24 | 24 |
| Split | 24 | 24 |
| Sqrt | 49 | 49 |
| Squeeze | 72 | 72 |
| Sub | 49 | 49 |
| Tanh | 1 | 1 |
| Transpose | 96 | 96 |
| Unsqueeze | 1057 | 409 |
| Where | 120 | 24 |
+--------------------+------------------------------------------+------------------------------------------+
| Model Size | 2.14 GB | 1.44 MB (2.14 GB) |
+--------------------+------------------------------------------+------------------------------------------+
| Elapsed Time | 33.37 s |
+--------------------+------------------------------------------+------------------------------------------+
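For reference, a rough sketch of the kind of pipeline behind the three steps above, using onnx, onnxslim, and onnxconverter-common. This is an illustrative outline, not necessarily the exact script used for this PR; file names such as `model.onnx_data` and `model_fp16.onnx` are placeholders:

```python
# Illustrative sketch only; the actual conversion script for this PR may differ.
import onnx
import onnxslim
from onnxconverter_common import float16

# 1. Slim the exported graph (constant folding, shape inference, dead-node removal).
slimmed = onnxslim.slim("model.onnx")

# 2. Save with the external data format: the graph stays in a small model.onnx,
#    and all large tensors are consolidated into a single model.onnx_data file
#    instead of one file per tensor.
onnx.save_model(
    slimmed,
    "model.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="model.onnx_data",
    size_threshold=1024,
)

# 3. Build the fp16 variant from a fresh copy (onnx.load resolves the external
#    data automatically) and keep fp32 inputs/outputs for drop-in compatibility.
fp32 = onnx.load("model.onnx")
fp16_model = float16.convert_float_to_float16(fp32, keep_io_types=True)
onnx.save_model(fp16_model, "model_fp16.onnx")
```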
Hi @Xenova, thanks for your contribution!
Does the usage of the ONNX model change with this new format? We have an example in the README, so please update it if necessary. Also, how did you combine the external data into a single file? Could you please share the conversion code? I'd like to apply the same process to https://huggingface.co/jinaai/jina-colbert-v2
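For context on the first question: with the external data format, onnxruntime resolves the side file transparently as long as it is downloaded next to model.onnx, so the inference code itself generally does not need to change. A minimal check, assuming the data file is named `model.onnx_data`:

```python
# Minimal sanity check: onnxruntime picks up model.onnx_data automatically
# when it sits in the same directory as model.onnx.
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
for inp in session.get_inputs():
    # Expect input_ids, attention_mask and task_id, as listed in the table above.
    print(inp.name, inp.type, inp.shape)
```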
bwang0911 changed pull request status to merged