NikshepShetty committed on
Commit
eecbbe1
1 Parent(s): eea5f05

Update README.md

Files changed (1)
  1. README.md +35 -1
README.md CHANGED
@@ -10,6 +10,26 @@ tags:
  - adapter
  - image-captioning
  - peft
+ model-index:
+ - name: Florence-2-DOCCI-FT
+   results:
+   - task:
+       type: image-to-text
+       name: Image Captioning
+     dataset:
+       name: foundation-multimodal-models/DetailCaps-4870
+       type: other
+     metrics:
+     - type: meteor
+       value: 0.240
+     - type: bleu
+       value: 0.150
+     - type: cider
+       value: 0.035
+     - type: capture
+       value: 0.553
+     - type: rouge-l
+       value: 0.294
  ---

  # Florence-2 Recap-DataComp LoRA Adapter
@@ -57,4 +77,18 @@ This code demonstrates how to:
  2. Load the LoRA adapter
  3. Process an image and generate a detailed caption

- Note: Make sure you have the required libraries installed: transformers, peft, einops, flash_attn, timm, Pillow, and requests.
+ Note: Make sure you have the required libraries installed: transformers, peft, einops, flash_attn, timm, Pillow, and requests.
+
+ ## Evaluation results
+
+ Our LoRA adapter shows improvements over the base Florence-2 model across all metrics for MORE_DETAILED_CAPTION tag for 1000 images on the foundation-multimodal-models/DetailCaps-4870 dataset:
+
+ | Metric | Base Model | Adapted Model | Improvement |
+ |---------|------------|-----------------------|-------------|
+ | METEOR | 0.213 | 0.240 | +12.7% |
+ | BLEU | 0.110 | 0.150 | +36.4% |
+ | CIDEr | 0.031 | 0.035 | +12.9% |
+ | CAPTURE | 0.546 | 0.553 | +1.3% |
+ | ROUGE-L | 0.275 | 0.294 | +6.9% |
+
+ These results demonstrate that our LoRA adapter enhances the image captioning capabilities of the Florence-2 base model, particularly in generating more detailed and accurate captions.
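
The Improvement column in the table added by this commit appears to be the relative change of the adapted score over the base score. The sketch below is not part of the commit. It recomputes that column from the table's values, assuming the formula (adapted - base) / base × 100 rounded to one decimal place. The names `scores` and `relative_improvement` are illustrative only.

```python
# Scores copied from the evaluation table: metric -> (base model, adapted model).
scores = {
    "METEOR": (0.213, 0.240),
    "BLEU": (0.110, 0.150),
    "CIDEr": (0.031, 0.035),
    "CAPTURE": (0.546, 0.553),
    "ROUGE-L": (0.275, 0.294),
}


def relative_improvement(base: float, adapted: float) -> float:
    """Return the percentage change of `adapted` relative to `base` (assumed formula)."""
    return (adapted - base) / base * 100.0


# Round to one decimal place, matching the precision used in the table.
improvements = {
    name: round(relative_improvement(base, adapted), 1)
    for name, (base, adapted) in scores.items()
}

for name, pct in improvements.items():
    print(f"{name:8s} {pct:+.1f}%")
```

Under that assumption, the recomputed percentages agree with the five values reported in the table.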