# Jina CLIP
The Jina CLIP implementation is hosted in this repository. The model uses:
* the EVA-02 architecture as the vision tower
* the Jina BERT with Flash Attention model as the text tower
To use the Jina CLIP model, the following packages are required:
* `torch`
* `timm`
* `transformers`
* `einops`
* `xformers` to use memory-efficient attention
* `flash-attn` to use flash attention
* `apex` to use fused layer normalization
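The required packages above can be installed with `pip`; a minimal sketch, assuming a CUDA environment for the optional accelerated kernels (`apex` is typically built from source, so it is shown only as a comment):

```shell
# core dependencies
pip install torch timm transformers einops

# optional: accelerated attention kernels (require a compatible CUDA setup)
pip install xformers
pip install flash-attn

# optional: fused layer normalization via apex is usually built from source,
# see the NVIDIA apex repository for build instructions
```

Only the core dependencies are strictly needed to load the model; the optional packages enable the faster attention and normalization paths listed above.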