Model not working
I tried to use the hosted Inference API for the ALIGN model, but it is not working. The error message I receive is: The model_type 'align' is not recognized. It could be a bleeding edge model, or incorrect.
I have also tried to import AlignModel and AlignProcessor from the transformers library, and I get an ImportError there as well. There seems to be some error in the model. Any updates/help would be highly appreciated!
The version of the transformers library that includes the ALIGN model has not been released yet; the model is only available on the main branch. To use ALIGN now, install transformers from source:
pip install git+https://github.com/huggingface/transformers
from transformers import AlignProcessor, AlignModel
processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
model = AlignModel.from_pretrained("kakaobrain/align-base")
"""
Downloading (β¦)rocessor_config.json: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββ| 508/508 [00:00<00:00, 59.5kB/s]
Downloading (β¦)okenizer_config.json: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββ| 399/399 [00:00<00:00, 53.4kB/s]
Downloading (β¦)solve/main/vocab.txt: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββ| 232k/232k [00:00<00:00, 279kB/s]
Downloading (β¦)cial_tokens_map.json: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββ| 125/125 [00:00<00:00, 49.1kB/s]
Downloading (β¦)lve/main/config.json: 100%|βββββββββββββββββββββββββββββββββββββββββββββββ| 5.25k/5.25k [00:00<00:00, 660kB/s]
Downloading pytorch_model.bin: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 690M/690M [00:31<00:00, 22.1MB/s]
"""
I think something is wrong with the weight initialization of the ALIGN model class. It shows me a warning that some layers are newly initialized and that the model should be trained before use!
How do I fine-tune this model?
@Sersh You can fine-tune this model in the same way you fine-tune other PyTorch models. Here's one way you can do it:
import torch
import torch.nn as nn
from transformers import AlignModel

class AlignClassifier(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Load the pretrained ALIGN backbone
        self.model = AlignModel.from_pretrained("kakaobrain/align-base")
        # The embedding size of the ALIGN model is 640 for each modality,
        # so the concatenated image + text embedding is 1280-dimensional
        hidden_size = 640 + 640
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, **inputs):
        outputs = self.model(**inputs)
        image_embeds = outputs.image_embeds
        text_embeds = outputs.text_embeds
        # Concatenate both embeddings along the feature dimension
        embeds = torch.cat((image_embeds, text_embeds), dim=1)
        return self.fc(embeds)
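A hypothetical way to call it (the image, caption, and num_classes below are placeholders I chose for illustration):

import requests
from PIL import Image
from transformers import AlignProcessor

processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
clf = AlignClassifier(num_classes=2)  # num_classes is a placeholder

# Placeholder image and caption for illustration
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(text=["a photo of a cat"], images=image, return_tensors="pt")

logits = clf(**inputs)  # shape: (batch_size, num_classes)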
@rabiulawal
If you use the from_pretrained method correctly, the model should have the trained weights. Make sure you are not initializing the model from a config, as that essentially gives you only the architecture defined by your configuration, with randomly initialized weights.
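For illustration, here are the two initialization paths side by side; only the first loads the pretrained checkpoint:

from transformers import AlignModel, AlignConfig

# Loads the trained weights from the checkpoint
model = AlignModel.from_pretrained("kakaobrain/align-base")

# Builds only the architecture from a config; weights are randomly initialized
config = AlignConfig()
model_random = AlignModel(config)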
model = AlignVisionModel.from_pretrained('/opt/licy/vms/align')
I used this code to load the model. Why does it show a lot of missing parameters?
Some weights of AlignVisionModel were not initialized from the model checkpoint at /opt/licy/vms/align and are newly initialized: ['encoder.blocks.24.expansion.expand_bn.running_var', 'encoder.blocks.33.projection.project_bn.running_mean', 'encoder.blocks.12.depthwise_conv.depthwise_norm.num_batches_tracked', 'encoder.blocks.36.projection.project_bn.bias', 'encoder.blocks.40.squeeze_excite.expand.weight', 'encoder.blocks.23.depthwise_conv.depthwise_conv.weight', 'encoder.blocks.38.expansion.expand_bn.bias', 'encoder.blocks.49.squeeze_excite.reduce.bias', 'encoder.blocks.6.expansion.expand_bn.bias', 'encoder.blocks.40.expansion.expand_bn.bias', 'encoder.blocks.44.depthwise_conv.depthwise_norm.running_mean', 'encoder.blocks.44.depthwise_conv.depthwise_norm.weight', 'encoder.blocks.2.projection.project_bn.running_mean', 'encoder.blocks.43.projection.project_bn.running_var', 'encoder.blocks.53.expansion.expand_bn.running_var', 'encoder.blocks.54.depthwise_conv.depthwise_norm.bias', 'encoder.blocks.11.squeeze_excite.reduce.bias', 'encoder.blocks.35.depthwise_conv.depthwise_norm.num_batches_tracked', 'encoder.blocks.49.depthwise_conv.depthwise_conv.weight', 'encoder.blocks.49.depthwise_conv.depthwise_norm.running_var', 'encoder.blocks.53.expansion.expand_bn.weight', 'encoder.blocks.27.projection.project_bn.weight', 'encoder.blocks.6.depthwise_conv.depthwise_norm.num_batches_tracked', 'encoder.blocks.40.depthwise_conv.depthwise_norm.weight', 'encoder.blocks.18.projection.project_bn.running_var', 'encoder.blocks.9.expansion.expand_bn.running_var', 'encoder.blocks.32.squeeze_excite.expand.weight', 'encoder.blocks.40.squeeze_excite.reduce.weight', 'encoder.blocks.42.projection.project_bn.bias', 'encoder.blocks.52.projection.project_conv.weight', 'encoder.blocks.3.depthwise_conv.depthwise_norm.bias', 'encoder.blocks.0.depthwise_conv.depthwise_norm.running_var', 'encoder.blocks.27.projection.project_bn.num_batches_tracked', 'encoder.blocks.35.depthwise_conv.depthwise_norm.weight', 'encoder.blocks.15.expansion.expand_bn.bias', 'encoder.blocks.44.expansion.expand_bn.num_batches_tracked', 'encoder.blocks.3.depthwise_conv.depthwise_norm.weight', 'encoder.blocks.7.expansion.expand_bn.running_var', 'encoder.blocks.10.projection.project_bn.bias', 'encoder.blocks.52.depthwise_conv.depthwise_norm.running_mean', 'encoder.blocks.17.expansion.expand_conv.weight', ...........................