Xi

xi0v

AI & ML interests

Diffusion Model Merging, LLM Merging, Model Editing and Vision/Multimodal Model Fine-tuning.

xi0v's activity

upvoted a collection about 15 hours ago

HyenaDNA Models

HyenaDNA models usable directly with Hugging Face classes like AutoModel. • 8 items • Updated Nov 14, 2023 • 15

upvoted a paper about 15 hours ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published 4 days ago • 69

upvoted 3 collections 4 days ago

Qwen 2.5 Coder

Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 4 days ago • 17

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 9 items • Updated Sep 23 • 46

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated about 22 hours ago • 209

upvoted 2 papers 10 days ago

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published 12 days ago • 63

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published 12 days ago • 105

upvoted an article 11 days ago

Glaze and the Effectiveness of Anti-AI Methods for Diffusion Models

Article • May 15 • 7

upvoted 2 papers 13 days ago

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

Paper • 2309.10400 • Published Sep 19, 2023 • 26

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published 15 days ago • 44

upvoted a paper 14 days ago

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 84

upvoted a paper 19 days ago

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 47

upvoted a collection 19 days ago

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 12 days ago • 95

upvoted a collection 23 days ago

MoE Girl

The MoE Girl series of small, sparse roleplay models • 3 items • Updated 23 days ago • 2

upvoted 2 collections 25 days ago

RPMax v1 Models

RPMax series of models with higher creativity and reduced repetition for "classic" RP chats. • 15 items • Updated about 13 hours ago • 15

Tools for LLM

6 items • Updated 21 days ago • 2

upvoted a paper 28 days ago

Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant

Paper • 2410.15316 • Published about 1 month ago • 10

upvoted an article 28 days ago

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Article • 28 days ago • 43

upvoted a paper 28 days ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published 29 days ago • 57

upvoted an article 28 days ago

Advanced Flux Dreambooth LoRA Training with 🧨 diffusers

Article • 29 days ago • 27