Fine-tuning for more diverse and accurate descriptions of a person/an animal?
This is an amazing model! Thank you to the author, @vikhyatk, for open-sourcing it, and thanks for the constant updates!
I use Moondream2 in ComfyUI to generate prompts, mainly by asking for a short, accurate description of an image of a person or pet. Sometimes, though, Moondream can't accurately describe the hairstyle or outfit of a person, or the breed of an animal. It usually gives a very general answer such as "short hair", "long hair", "brown dog", or "grey cat", or just says something random, instead of a more specific description such as "middle part hair", "pixie cut", "black tank top", or "Labrador". So I'm wondering if it would be possible to add more training data and fine-tune on this area in the future? I think it would be a great help for auto-prompting in image generation with this amazing locally runnable model. Many thanks!
Following