# Jina CLIP
The Jina CLIP implementation is hosted in this repository. The model uses:
* the EVA-02 architecture as the vision tower
* the Jina BERT with Flash Attention model as the text tower
To use the Jina CLIP model, the following packages are required:
* `torch`
* `timm`
* `transformers`
* `einops`
* `xformers` to use memory-efficient attention
* `flash-attn` to use flash attention
* `apex` to use fused layer normalization
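Assuming the core packages are available on PyPI under the names listed above, a minimal install could look like the following sketch (the last three packages are only needed for their respective optional features):

```shell
# Core dependencies
pip install torch timm transformers einops

# Optional: memory-efficient attention and flash attention kernels
pip install xformers flash-attn

# Note: NVIDIA apex is typically built from source rather than installed
# from PyPI; see the NVIDIA/apex repository for build instructions.
```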