All HF Hub posts

merve
posted an update 1 day ago
Fine-tune Florence-2 on any task 🔥

Today we release a notebook and a walkthrough blog on fine-tuning Florence-2 on the DocVQA dataset, together with @andito @SkalskiP

Blog: https://huggingface.co/blog 📕
Notebook: https://colab.research.google.com/drive/1hKDrJ5AH_o7I95PtZ9__VlCTNAo1Gjpf?usp=sharing 📖
Florence-2 is a great vision-language model thanks to its massive pre-training dataset and small size!

The model requires conditioning through task prefixes, and it is not a generalist out of the box, so it needs fine-tuning for a new task such as DocVQA 📝

We fine-tuned the model on an A100 (a smaller GPU also works with a smaller batch size) and saw that the model picks up new tasks 🥹

See below how it looks before and after fine-tuning 🤩
Play with the demo here: andito/Florence-2-DocVQA 🏄‍♀️
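If you are curious what task-prefix conditioning looks like in practice, here is a minimal fine-tuning sketch in the spirit of the blog (my own illustration, not the exact notebook code; the "<DocVQA>" prefix, batch fields, and checkpoint name are assumptions):

```python
import torch
from transformers import AutoProcessor, AutoModelForCausalLM

model_id = "microsoft/Florence-2-base-ft"  # assumed checkpoint; the notebook may use another
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

def collate(batch):
    # Condition Florence-2 on the task by prefixing every question with a task token.
    questions = ["<DocVQA>" + ex["question"] for ex in batch]
    answers = [ex["answer"] for ex in batch]
    images = [ex["image"] for ex in batch]
    inputs = processor(text=questions, images=images, return_tensors="pt", padding=True)
    labels = processor.tokenizer(answers, return_tensors="pt", padding=True).input_ids
    return inputs, labels

# One hypothetical training step (optimizer and scheduler setup omitted):
# inputs, labels = collate(batch)
# loss = model(input_ids=inputs["input_ids"],
#              pixel_values=inputs["pixel_values"],
#              labels=labels).loss
# loss.backward()
```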
IlyasMoutawwakil
posted an update 2 days ago
Last week, Intel's new Xeon CPUs, Sapphire Rapids (SPR), landed on Inference Endpoints, and I think they have the potential to reduce the cost of your RAG pipelines 💸

Why? Because they come with Intel® AMX support, a set of instructions that accelerate BF16 and INT8 matrix multiplications on CPU ⚡

I went ahead and built a Space to showcase how to efficiently deploy embedding models on SPR for both retrieving and ranking documents, with Haystack-compatible components: optimum-intel/haystack-e2e

Here's how it works:

- Document Store: A FAISS document store containing the seven-wonders dataset, embedded, indexed and stored on the Space's persistent storage to avoid unnecessary re-computation of embeddings.

- Retriever: It embeds the query at runtime and retrieves from the dataset the N documents most semantically similar to the query's embedding.
We use the small variant of the BGE family here because we want a model that's fast to run on the entire dataset and has a small embedding space for fast similarity search. Specifically, we use an INT8-quantized bge-small-en-v1.5, deployed on an Intel Sapphire Rapids CPU instance.

- Ranker: It re-embeds the retrieved documents at runtime and re-ranks them by semantic similarity to the query's embedding. We use the large variant of the BGE family here because it's optimized for accuracy, allowing us to keep only the most relevant k documents for the LLM prompt. Specifically, we use an INT8-quantized bge-large-en-v1.5, deployed on an Intel Sapphire Rapids CPU instance.
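To make the flow concrete, here is a rough, framework-agnostic sketch of the same retrieve-then-rerank pattern using sentence-transformers and FAISS directly (the Space itself uses Haystack-compatible fastRAG components on INT8-quantized endpoints; the unquantized model names and the toy corpus below are only for illustration):

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy corpus standing in for the seven-wonders dataset.
documents = [
    "The Great Pyramid of Giza is the oldest of the Seven Wonders.",
    "The Hanging Gardens of Babylon were described by ancient Greek writers.",
    "The Colossus of Rhodes was a statue of the sun god Helios.",
]

# Retriever: small, fast model with a compact embedding space.
retriever = SentenceTransformer("BAAI/bge-small-en-v1.5")
doc_emb = retriever.encode(documents, normalize_embeddings=True).astype(np.float32)
index = faiss.IndexFlatIP(doc_emb.shape[1])  # inner product == cosine on normalized vectors
index.add(doc_emb)

query = "Which wonder was located in Babylon?"
q_emb = retriever.encode([query], normalize_embeddings=True).astype(np.float32)
top_n = min(2, len(documents))  # N candidate documents handed to the ranker
_, ids = index.search(q_emb, top_n)
candidates = [documents[i] for i in ids[0]]

# Ranker: larger, more accurate model re-scores the candidates used in the LLM prompt.
ranker = SentenceTransformer("BAAI/bge-large-en-v1.5")
q_large = ranker.encode([query], normalize_embeddings=True)
c_large = ranker.encode(candidates, normalize_embeddings=True)
scores = (q_large @ c_large.T)[0]
reranked = [candidates[i] for i in np.argsort(-scores)]
print(reranked)
```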

Space: optimum-intel/haystack-e2e
Retriever IE: optimum-intel/fastrag-retriever
Ranker IE: optimum-intel/fastrag-ranker
ucsahin
posted an update about 13 hours ago
Florence-2 is very capable at detecting various objects in a zero-shot setting with the task prompt "<OD>". However, if you want to detect specific objects that the base model cannot handle in its current form, you can easily fine-tune it for that particular task. Below I show how to fine-tune the model to detect tables in a given image, but a similar process can be applied to detect any object. Thanks to @andito, @merve, and @SkalskiP for sharing the fix for fine-tuning the Florence-2 model. Please also check their great blog post at https://huggingface.co/blog/finetune-florence2.

Colab notebook: https://colab.research.google.com/drive/1Y8GVjwzBIgfmfD3ZypDX5H1JA_VG0YDL?usp=sharing
Finetuned model: ucsahin/Florence-2-large-TableDetection
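For reference, zero-shot detection with the "<OD>" task prompt follows the pattern shown on the Florence-2 model card; here is a short sketch (the local image path is a placeholder):

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

model_id = "microsoft/Florence-2-large"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("document_page.png")  # placeholder: any image you want to run detection on
task = "<OD>"

inputs = processor(text=task, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

# The Florence-2 processor ships a helper that parses the raw text into boxes and labels.
parsed = processor.post_process_generation(
    generated_text, task=task, image_size=(image.width, image.height)
)
print(parsed["<OD>"])  # {"bboxes": [...], "labels": [...]}
```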
singh96aman
posted an update 1 day ago
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges (2406.12624)

Can LLMs serve as reliable judges ⚖️?

We aim to identify the right metrics for evaluating judge LLMs and to understand their sensitivity to prompt guidelines, engineering, and specificity. With this paper, we want to raise caution ⚠️ against blindly using LLMs as a proxy for human judgment.

Blog - https://huggingface.co/blog/singh96aman/judgingthejudges
Arxiv - https://arxiv.org/abs/2406.12624
Tweet - https://x.com/iamsingh96aman/status/1804148173008703509

@singh96aman @kartik727 @Srinik-1 @sankaranv @dieuwkehupkes
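As a toy illustration of why the metric matters when measuring judge-human alignment (my own example, not taken from the paper), compare raw percent agreement with Cohen's kappa, which corrects for agreement expected by chance:

```python
from sklearn.metrics import cohen_kappa_score

human_labels = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # human verdicts (1 = answer accepted)
judge_labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]  # judge-LLM verdicts

agreement = sum(h == j for h, j in zip(human_labels, judge_labels)) / len(human_labels)
kappa = cohen_kappa_score(human_labels, judge_labels)
print(f"percent agreement = {agreement:.2f}, Cohen's kappa = {kappa:.2f}")
# On skewed label distributions a lenient judge can reach high agreement
# while kappa stays much lower, which is one reason metric choice matters.
```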
RishabhBhardwaj
posted an update 1 day ago
🎉 We are thrilled to share our work on model merging. We proposed a new approach, Della-merging, which combines expert models from various domains into a single, versatile model. Della employs a magnitude-based sampling approach to eliminate redundant delta parameters, reducing interference when merging homologous models (those fine-tuned from the same backbone).

Della outperforms existing homologous model merging techniques such as DARE and TIES. Across three expert models (LM, Math, Code) and their corresponding benchmark datasets (AlpacaEval, GSM8K, MBPP), Della achieves an improvement of 3.6 points over TIES and 1.2 points over DARE.

Paper: DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling (2406.11617)
Github: https://github.com/declare-lab/della

@soujanyaporia @Tej3
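For intuition, here is a toy sketch of magnitude-based sampling of delta parameters (a simplified, rank-based variant written only for illustration; see the paper and repo for the actual sampling procedure and hyperparameters):

```python
import torch

def magnitude_sample_delta(delta: torch.Tensor, keep_frac: float = 0.4) -> torch.Tensor:
    """Keep each delta parameter with probability growing with its magnitude,
    zero out the rest, and rescale survivors to keep the expected update unchanged."""
    flat = delta.abs().flatten()
    # Rank magnitudes into [0, 1]; higher rank -> higher keep probability, mean ~ keep_frac.
    ranks = flat.argsort().argsort().float() / max(flat.numel() - 1, 1)
    p_keep = (keep_frac * (0.5 + ranks)).clamp(0.0, 1.0).view_as(delta)
    mask = torch.bernoulli(p_keep)
    return delta * mask / p_keep.clamp_min(1e-8)

# Hypothetical usage when merging an expert back into its backbone:
# pruned = magnitude_sample_delta(expert_weight - base_weight)
# merged_weight = base_weight + pruned  # in practice, scaled and summed across experts
```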
alvdansen
posted an update 2 days ago
A few new styles added as SDXL LoRA:

Midsommar Cartoon
A playful cartoon style featuring bold colors and a retro aesthetic. Personal favorite at the moment.
alvdansen/midsommarcartoon
---
Wood Block XL
I've started training public-domain styles to create some interesting datasets. In this case I found a group of images taken from really beautiful and colorful Japanese woodblock prints.
alvdansen/wood-block-xl
---
Dimension W
For this model I actually ended up training an SD 1.5 version as well as an SDXL one. I prefer the SDXL version, and I am still looking for parameters I am really happy with for SD 1.5. That said, both have their merits. I trained this with the short film I am working on in mind.
alvdansen/dimension-w
alvdansen/dimension-w-sd15
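If you want to try one of these styles, loading an SDXL LoRA from the Hub with diffusers looks roughly like this (the prompt is my own; check each model card for recommended trigger words and settings):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("alvdansen/midsommarcartoon")  # or wood-block-xl / dimension-w

image = pipe("a cozy village street at dusk, cartoon style", num_inference_steps=30).images[0]
image.save("midsommar_cartoon.png")
```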
MonsterMMORPG
posted an update 4 days ago
Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI

https://youtu.be/HKX8_F1Er_w

Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with SwarmUI, the most advanced open-source generative AI app. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet, so I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI, and your mind will be blown after you watch this tutorial and learn its amazing features. StableSwarmUI uses #ComfyUI as the back end, so it has all the good features of ComfyUI while bringing the easy-to-use features of the Automatic1111 #StableDiffusion Web UI with them. I really like SwarmUI and plan to do more tutorials for it.

🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985

0:00 Introduction to the Stable Diffusion 3 (SD3) and SwarmUI and what is in the tutorial
4:12 Architecture and features of SD3
5:05 What each of the different Stable Diffusion 3 model files means
6:26 How to download and install SwarmUI on Windows for SD3 and all other Stable Diffusion models
8:42 What kind of folder path you should use when installing SwarmUI
10:28 How to notice and fix an installation error if you get one
11:49 Installation has been completed and now how to start using SwarmUI
12:29 Which settings I change before starting to use SwarmUI, and how to change your theme (dark, white, gray)
12:56 How to make SwarmUI save generated images as PNG
13:08 How to find the description of each setting and configuration
13:28 How to download the SD3 model and start using it on Windows
13:38 How to use model downloader utility of SwarmUI
14:17 How to set models folder paths and link your existing models folders in SwarmUI
14:35 Explanation of Root folder path in SwarmUI
14:52 Do we need to download the VAE of SD3?
artnitolog
posted an update about 11 hours ago
Recently, we open-sourced YaFSDP, Yandex's tool for efficient distributed training of LLMs.

Here are some of the key ideas used in YaFSDP to provide speedup and memory savings over FSDP:
• Allocate and utilize just two buffers throughout the transformer for all collected weights to circumvent the torch memory allocator;
• Gather small normalization layers at the beginning of the iteration and average the gradients only at the end;
• Move gradient division to the very end of the backward pass.

To learn more about how YaFSDP works, check out our latest blog post: https://medium.com/yandex/yafsdp-a-tool-for-faster-llm-training-and-optimized-gpu-utilization-is-no-632b7539f5b3
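As a conceptual illustration of the first point only (not YaFSDP's actual implementation; buffer sizing and scheduling are simplified assumptions), two pre-allocated buffers can alternate between consecutive layers so that no per-layer tensors are requested from the torch allocator:

```python
import torch
import torch.distributed as dist

def make_gather_buffers(max_gathered_numel: int, device, dtype=torch.bfloat16):
    # Two reusable buffers, each large enough for the largest layer's full (unsharded) weights.
    return [torch.empty(max_gathered_numel, device=device, dtype=dtype) for _ in range(2)]

def gather_layer_weights(local_shard: torch.Tensor, buffers, layer_idx: int) -> torch.Tensor:
    # Alternate buffers between consecutive layers: while layer i computes from one buffer,
    # layer i+1's shards can be all-gathered into the other, with no new allocations.
    world_size = dist.get_world_size()
    out = buffers[layer_idx % 2][: local_shard.numel() * world_size]
    dist.all_gather_into_tensor(out, local_shard.contiguous())
    return out.view(world_size, -1)
```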
Symbol-LLM
posted an update about 23 hours ago
📣 Thrilled to make public our recent work, ENVISIONS!

- No human annotations!
- No distillation from strong LLMs!
- Self-improves LLMs in the environment
- Amazing performance on agentic and reasoning tasks
- Insightful analysis of the "why" questions

📝 Title: Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models

📎 Repo: https://github.com/xufangzhi/ENVISIONS
Taf2023
posted an update 1 day ago