README.md · F16/florence2-large-ft-gufeng_v3 at 8a25f14b1f08712e9aa513f3252c96a240f06219

How to Get Started with the Model

Use the code below to get started with the model. It loads the model and processor, then generates a detailed caption for a sample image.

import requests
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# trust_remote_code=True lets transformers run the custom model code shipped in the repository.
model = AutoModelForCausalLM.from_pretrained("F16/florence2-large-ft-gufeng_v3", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("F16/florence2-large-ft-gufeng_v3", trust_remote_code=True)

# The task token selects what the model produces; this one requests a long caption.
prompt = "<MORE_DETAILED_CAPTION>"

# Download a sample image.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)

# Tokenize the prompt and preprocess the image into PyTorch tensors.
inputs = processor(text=prompt, images=image, return_tensors="pt")

# Deterministic beam search (no sampling) with 3 beams.
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    do_sample=False,
    num_beams=3,
)

# Keep special tokens here: post_process_generation uses them to parse the output.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

# Convert the raw text into the result format for the given task.
parsed_answer = processor.post_process_generation(generated_text, task=prompt, image_size=(image.width, image.height))

print(parsed_answer)
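
The printed result is a parsed structure rather than a bare string. As a minimal sketch, assuming this fine-tune follows the upstream Florence-2 convention of returning a dictionary keyed by the task token (for example `{"<MORE_DETAILED_CAPTION>": "..."}`), the caption text could be pulled out as shown below. The helper name `extract_caption` and the placeholder dictionary are hypothetical and only illustrate the idea; they are not part of the model's API.

```python
from typing import Any, Mapping


def extract_caption(parsed_answer: Mapping[str, Any], task: str) -> str:
    """Return the caption stored under the task token (hypothetical helper).

    Assumes parsed_answer maps the task token to a plain string, which is
    how upstream Florence-2 formats caption tasks; this is an assumption
    for this particular fine-tune.
    """
    value = parsed_answer.get(task)
    if not isinstance(value, str):
        raise ValueError(f"No caption string found under task {task!r}")
    # Strip surrounding whitespace left over from decoding.
    return value.strip()


# Placeholder data standing in for the real parsed_answer from the model.
example = {"<MORE_DETAILED_CAPTION>": "  A placeholder caption.  "}
print(extract_caption(example, "<MORE_DETAILED_CAPTION>"))
```

With the snippet above, this would be called as `extract_caption(parsed_answer, prompt)`. Raising on a missing or non-string value makes a mismatch between the task token and the output format visible immediately.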

license: mit