Can Blip2ForImageTextRetrieval be trained with Trainer?
#5, opened by wang-sy
In the definition of `Blip2ImageTextMatchingModelOutput`, `loss` is defined as:
Args:
    loss (`torch.FloatTensor` of shape `(1,)`, *optional*, returned when `return_loss` is `True`):
        Contrastive loss for image-text similarity.
However, the loss does not appear to be computed in the `forward` method of `Blip2ForImageTextRetrieval`. Am I missing something? Where is the loss calculated?
Is training `Blip2ForImageTextRetrieval` with the `Trainer` supported?
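For context, this is roughly the loss I would expect from the docstring. It is only a minimal sketch, not the library's implementation: I am assuming a CLIP-style symmetric cross-entropy, that the model's output exposes an `(N, N)` image-to-text similarity matrix (called `logits_per_image` here), and that the i-th image matches the i-th text in the batch. The function name `contrastive_loss` is my own.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(logits_per_image: torch.Tensor) -> torch.Tensor:
    """Symmetric cross-entropy over an (N, N) image-text similarity matrix.

    Assumption: entry (i, j) scores image i against text j, and the
    matching pairs sit on the diagonal.
    """
    n = logits_per_image.size(0)
    # The correct "class" for row i is column i.
    targets = torch.arange(n, device=logits_per_image.device)
    # Image-to-text direction: each row is a distribution over texts.
    loss_i2t = F.cross_entropy(logits_per_image, targets)
    # Text-to-image direction: transpose so each row is a distribution over images.
    loss_t2i = F.cross_entropy(logits_per_image.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```

If the model never fills in `loss` itself, I assume something like this would have to be called from an overridden `Trainer.compute_loss` in a `Trainer` subclass, but I have not verified that this is the intended approach.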
Thank you for the great work!