
ConvNeXt V2 fine-tuned for camera angle classification

A base-size ConvNeXt V2 model fine-tuned to classify camera angles. The model was fine-tuned on the Cinescale dataset for 30 epochs.

The model assigns an image to one of five classes: dutch, high, low, neutral, or overhead.

Evaluation

On the test set (test.csv), the model achieves an accuracy of 94.85% and a macro-F1 of 92.52%.
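
For readers unfamiliar with the two metrics, the following is a minimal sketch of how accuracy and macro-F1 are computed for a five-class problem. It is not the evaluation script used for this model: the helper names `accuracy` and `macro_f1` are hypothetical, and the toy labels are made up purely for illustration.

```python
from collections import Counter

# Class names as listed in this model card.
LABELS = ["dutch", "high", "low", "neutral", "overhead"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy data (made up): the single "overhead" sample is misclassified as "neutral".
y_true = ["dutch", "high", "low", "neutral", "overhead", "neutral"]
y_pred = ["dutch", "high", "low", "neutral", "neutral", "neutral"]

print(round(accuracy(y_true, y_pred), 4))
print(round(macro_f1(y_true, y_pred, LABELS), 4))
```

Because macro-F1 weights every class equally, one poorly predicted rare class lowers it more than it lowers accuracy, which is one plausible reason the macro-F1 reported above is lower than the accuracy.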

How to use

from transformers import AutoModelForImageClassification
import torch
from torchvision.transforms import v2
from torchvision.io import read_image, ImageReadMode


model = AutoModelForImageClassification.from_pretrained("gullalc/convnextv2-base-22k-384-cinescale-angle")
im_size = 384

# Demo image source: https://www.pexels.com/photo/man-in-black-dress-walking-in-between-brown-wooden-pews-9614069/
image = read_image("demo/angle_demo.jpg", mode=ImageReadMode.RGB)

# Resize, center-crop to a square, scale to [0, 1], then normalize with ImageNet statistics
transform = v2.Compose([
    v2.Resize(im_size, antialias=True),
    v2.CenterCrop((im_size, im_size)),
    v2.ToDtype(torch.float32, scale=True),
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

inputs = transform(image).unsqueeze(0)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(pixel_values=inputs)

# Map the highest-scoring class index to its label
predicted_label = model.config.id2label[outputs.logits.argmax(dim=-1).item()]
print(predicted_label)
# --> high
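
If per-class confidence scores are wanted in addition to the top label, the logits can be passed through a softmax. The sketch below uses only the standard library and made-up logits so that it runs on its own; in practice the logits could come from `outputs.logits[0].tolist()` and the label order from `model.config.id2label`. The label order shown here and the helper names `softmax` and `rank_classes` are assumptions for illustration, not part of the model's API.

```python
import math

# Assumed label order (hypothetical); check model.config.id2label for the real mapping.
LABELS = ["dutch", "high", "low", "neutral", "overhead"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_classes(logits, labels):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits for one image, not real model output.
example_logits = [-1.2, 3.4, 0.3, 1.1, -0.5]

for label, prob in rank_classes(example_logits, LABELS):
    print(f"{label}: {prob:.3f}")
```

Inspecting the full ranking rather than only the argmax can help flag ambiguous shots, for example when the top two classes receive similar probabilities.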

Model size: 87.7M parameters (Safetensors format, F32 tensors)