Add model
- README.md +42 -0
- merges.txt +0 -0
- open_clip_config.json +35 -0
- open_clip_model.safetensors +3 -0
- open_clip_pytorch_model.bin +3 -0
- special_tokens_map.json +30 -0
- tokenizer.json +0 -0
- tokenizer_config.json +30 -0
- vocab.json +0 -0
README.md
ADDED
@@ -0,0 +1,42 @@
---
license: mit
library_name: open_clip
pipeline_tag: zero-shot-image-classification
---

[[Paper]](https://openreview.net/forum?id=e3scLKNiNg&noteId=e3scLKNiNg) [[GitHub]](https://github.com/fra31/perceptual-metrics)

Robust perceptual metric based on the CLIP model `laion/CLIP-convnext_base_w-laion2B-s13B-b82K-augreg`.

Adversarially fine-tuned with FARE ([Schlarmann et al., 2024](https://arxiv.org/abs/2402.12336)) on ImageNet, using the l-infinity norm with radius 4/255.
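
In a nutshell, FARE crafts an l-infinity-bounded adversarial image with PGD and then fine-tunes the vision encoder so that its embedding of the adversarial image stays close to the frozen original encoder's embedding of the clean image. The sketch below illustrates that objective; it is not the authors' training code, and the PGD step count and step size are assumed values for illustration.

```python
# Illustrative sketch of the FARE objective (Schlarmann et al., 2024).
# PGD hyperparameters and the use of model.visual are assumptions.
import copy
import torch
import open_clip

# Start from the original (non-robust) CLIP model that FARE fine-tunes.
model, _, _ = open_clip.create_model_and_transforms(
    'hf-hub:laion/CLIP-convnext_base_w-laion2B-s13B-b82K-augreg')
frozen = copy.deepcopy(model.visual).eval()  # original encoder, kept frozen
for p in frozen.parameters():
    p.requires_grad_(False)

eps, alpha, steps = 4 / 255, 1 / 255, 10  # l-inf radius 4/255; assumed PGD step size/count

def fare_loss(images):
    # images: batch in [0, 1]; normalization is folded into the encoders
    # here for simplicity (an assumption of this sketch).
    with torch.no_grad():
        target = frozen(images)  # clean embeddings from the frozen original model
    delta = torch.zeros_like(images).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):  # PGD: maximize the embedding distortion
        adv = (images + delta).clamp(0, 1)
        dist = (model.visual(adv) - target).pow(2).sum(-1).mean()
        grad, = torch.autograd.grad(dist, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    adv = (images + delta).clamp(0, 1).detach()
    # The fine-tuning step minimizes this distance w.r.t. the model parameters.
    return (model.visual(adv) - target).pow(2).sum(-1).mean()
```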

Performance on the perceptual similarity task [NIGHTS](https://dreamsim-nights.github.io):

| Clean | L-inf, eps=4/255 | L2, eps=3 |
|-------|------------------|-----------|
| 90.6  | 74.3             | 66.1      |

## Usage
```python
import open_clip

model, _, image_processor = open_clip.create_model_and_transforms('hf-hub:chs20/FARE4-convnext_base_w-laion2B-s13B-b82K-augreg')
```
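
To use the model as a perceptual metric, compare image embeddings. The sketch below is illustrative: the file names are placeholders, and cosine distance is one common choice for comparing unit-normalized CLIP embeddings.

```python
import torch
from PIL import Image
import open_clip

model, _, image_processor = open_clip.create_model_and_transforms(
    'hf-hub:chs20/FARE4-convnext_base_w-laion2B-s13B-b82K-augreg')
model.eval()

def embed(path):
    # Preprocess a single image and return its unit-normalized embedding.
    image = image_processor(Image.open(path).convert('RGB')).unsqueeze(0)
    with torch.no_grad():
        feat = model.encode_image(image)
    return feat / feat.norm(dim=-1, keepdim=True)

a, b = embed('img_a.png'), embed('img_b.png')  # placeholder file names
distance = 1 - (a * b).sum(-1)                 # cosine distance: lower = more similar
print(distance.item())
```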

## Citation
If you find this model useful, please consider citing our papers:
```bibtex
@inproceedings{croce2024adversarially,
    title={Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics},
    author={Croce, Francesco and Schlarmann, Christian and Singh, Naman Deep and Hein, Matthias},
    year={2024},
    booktitle={{ICML Workshop on Foundation Models in the Wild}}
}
```

```bibtex
@inproceedings{schlarmann2024robustclip,
    title={Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models},
    author={Schlarmann, Christian and Singh, Naman Deep and Croce, Francesco and Hein, Matthias},
    year={2024},
    booktitle={{ICML}}
}
```
merges.txt
ADDED
The diff for this file is too large to render.
See raw diff
open_clip_config.json
ADDED
@@ -0,0 +1,35 @@
{
  "model_cfg": {
    "embed_dim": 640,
    "vision_cfg": {
      "timm_model_name": "convnext_base",
      "timm_model_pretrained": false,
      "timm_pool": "",
      "timm_proj": "linear",
      "timm_drop": 0.0,
      "timm_drop_path": 0.1,
      "image_size": 256
    },
    "text_cfg": {
      "context_length": 77,
      "vocab_size": 49408,
      "width": 640,
      "heads": 10,
      "layers": 12
    }
  },
  "preprocess_cfg": {
    "mean": [
      0.48145466,
      0.4578275,
      0.40821073
    ],
    "std": [
      0.26862954,
      0.26130258,
      0.27577711
    ],
    "interpolation": "bicubic",
    "resize_mode": "shortest"
  }
}
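
For reference, the `preprocess_cfg` above corresponds roughly to the following torchvision pipeline. This is a sketch only; open_clip builds its own transforms from this config, so treat it as illustrative rather than canonical.

```python
from torchvision import transforms

# Approximate reconstruction of preprocess_cfg: resize the shortest side to
# 256 with bicubic interpolation, center-crop to the model's image_size,
# then normalize with the CLIP mean/std listed above.
preprocess = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),
                         std=(0.26862954, 0.26130258, 0.27577711)),
])
```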
open_clip_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26360ef11b916ec985996655773a271f6d4a28e417aed4aa212cdf8b671194ba
size 717597012
open_clip_pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0eb837189a0d74c3c484a7b6eda776cd6734cc0b64f6bd3cd3c7f5fedb62ccec
size 717742056
special_tokens_map.json
ADDED
@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<|startoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer_config.json
ADDED
@@ -0,0 +1,30 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "49406": {
      "content": "<|startoftext|>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49407": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|startoftext|>",
  "clean_up_tokenization_spaces": true,
  "do_lower_case": true,
  "eos_token": "<|endoftext|>",
  "errors": "replace",
  "model_max_length": 77,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "CLIPTokenizer",
  "unk_token": "<|endoftext|>"
}
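
These tokenizer files (vocab, merges, and the config above) are what open_clip loads for the text tower. A minimal sketch of retrieving and using the tokenizer; the prompt string is just an example:

```python
import open_clip

# Load the CLIP tokenizer that consumes the vocab/merges/config files above.
tokenizer = open_clip.get_tokenizer('hf-hub:chs20/FARE4-convnext_base_w-laion2B-s13B-b82K-augreg')
tokens = tokenizer(["a photo of a cat"])  # LongTensor of shape [1, 77] (model_max_length)
```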
vocab.json
ADDED
The diff for this file is too large to render.
See raw diff