Model Card

Model Details

Architecture: CLIP ViT-Large with patch size 14 (ViT-L/14)
Training Data: DTD (Describable Textures Dataset)

Training Details

The model was fine-tuned with the Adam optimizer at a constant learning rate of 1e-5 for 4000 steps (batch_size=32). Only the vision encoder was fine-tuned; the rest of CLIP was left at its pre-trained weights.
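The setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' training script: `FineTuneConfig`, `freeze_all_but_vision`, and `build_optimizer` are hypothetical names, and the `vision_model.` prefix assumes the parameter naming used by Hugging Face's `CLIPModel`. Whether the visual projection layer was also updated is not stated in this card, so the sketch leaves it frozen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters stated in the card (names are illustrative)."""
    learning_rate: float = 1e-5  # constant, no schedule
    num_steps: int = 4000
    batch_size: int = 32

    @property
    def samples_seen(self) -> int:
        # Total training examples processed over the whole run.
        return self.num_steps * self.batch_size


def freeze_all_but_vision(model, vision_prefix="vision_model."):
    """Leave only parameters under `vision_prefix` trainable.

    Works with any object exposing `named_parameters()`, such as a
    torch.nn.Module. Returns the names of the trainable parameters.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(vision_prefix)
        if param.requires_grad:
            trainable.append(name)
    return trainable


def build_optimizer(model, config):
    """Create an Adam optimizer over the trainable parameters only."""
    import torch  # imported lazily so the helpers above work without PyTorch

    params = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(params, lr=config.learning_rate)
```

With these values, `FineTuneConfig().samples_seen` is 4000 × 32 = 128,000 training examples.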

Evaluation Results

pre-trained: 0.554787278175354
fine-tuned: 0.8547872304916382
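To put the two scores in perspective, the short calculation below computes the absolute and relative gain from fine-tuning. The card does not name the metric, so the variables are simply called scores here; the variable names are illustrative.

```python
# Scores copied verbatim from the card; the metric is not named there.
pretrained_score = 0.554787278175354
fine_tuned_score = 0.8547872304916382

absolute_gain = fine_tuned_score - pretrained_score
relative_gain = absolute_gain / pretrained_score

print(f"absolute gain: {absolute_gain:.4f}")  # absolute gain: 0.3000
print(f"relative gain: {relative_gain:.1%}")  # relative gain: 54.1%
```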

Safetensors

Model size: 303M params
Tensor type: F32
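As a rough guide to the size of the weights, the parameter count and tensor type above give the following estimate. This is a back-of-the-envelope sketch: the count is the rounded 303M figure, F32 is taken as 4 bytes per parameter, and activations and file overhead are ignored. The function name is illustrative.

```python
NUM_PARAMS = 303_000_000  # "303M params" as listed, rounded
BYTES_PER_PARAM = 4       # F32 stores each parameter in 4 bytes


def weight_footprint_bytes(num_params=NUM_PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the raw weights in bytes."""
    return num_params * bytes_per_param


size_bytes = weight_footprint_bytes()
print(f"{size_bytes / 1e9:.2f} GB, {size_bytes / 2**30:.2f} GiB")  # 1.21 GB, 1.13 GiB
```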

Model tree for tanganke/clip-vit-large-patch14_dtd

Base model: openai/clip-vit-large-patch14
This model: fine-tuned from the base model

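Since only the vision encoder was fine-tuned, one plausible way to use this model is to load the base CLIP model and swap in the fine-tuned vision weights. The sketch below assumes the repository stores a checkpoint loadable with `CLIPVisionModel` from the `transformers` library, which this card does not state explicitly; the function name is hypothetical, and calling it downloads both checkpoints.

```python
BASE_MODEL_ID = "openai/clip-vit-large-patch14"
FINETUNED_ID = "tanganke/clip-vit-large-patch14_dtd"


def load_clip_with_finetuned_vision(base_id=BASE_MODEL_ID, finetuned_id=FINETUNED_ID):
    """Return the base CLIP model with its vision encoder replaced.

    Assumption: `finetuned_id` holds a vision-encoder-only checkpoint.
    """
    # Imported lazily so this module can be inspected without transformers.
    from transformers import CLIPModel, CLIPVisionModel

    clip = CLIPModel.from_pretrained(base_id)
    vision = CLIPVisionModel.from_pretrained(finetuned_id)
    # Copy the fine-tuned weights into the full model's vision tower;
    # the text encoder and projections keep their pre-trained weights.
    clip.vision_model.load_state_dict(vision.vision_model.state_dict())
    return clip
```

Keeping the text encoder from the base model matches the training description, in which everything outside the vision encoder stayed at its pre-trained weights.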

Collection including tanganke/clip-vit-large-patch14_dtd

CLIP-ViT-L/14 on the eight image classification tasks

If you find these models helpful, consider citing [our paper](https://arxiv.org/abs/2406.03280).