ArturD (Artur Daveyan)

upvoted a paper 3 days ago

TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices

Paper • 2410.00531 • Published 5 days ago • 27

upvoted 2 collections 5 days ago

Yi-Coder

Collection

4 items • Updated Sep 4 • 29

BRAG-v0.1

Collection

BRAG is a series of SLMs (Small Language Models) specifically trained for RAG tasks. We release models with size 1.5b, 7b and 8b. • 4 items • Updated Aug 4 • 13

upvoted a collection 7 days ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 10 days ago • 327

upvoted a paper 8 days ago

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published 11 days ago • 40

upvoted an article 8 days ago

Article

Llama can now see and run on your device - welcome Llama 3.2

11 days ago

• 137

upvoted a paper 10 days ago

Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts

Paper • 2409.16040 • Published 12 days ago • 10

upvoted an article 15 days ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

18 days ago

• 144

upvoted 4 papers 15 days ago

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published 18 days ago • 81

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published 17 days ago • 35

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published 19 days ago • 17

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 17 days ago • 69

upvoted a paper 21 days ago

SongCreator: Lyrics-based Universal Song Generation

Paper • 2409.06029 • Published 26 days ago • 19

upvoted a paper 28 days ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592

upvoted 3 papers about 1 month ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 111

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published Aug 22 • 86

FocusLLM: Scaling LLM's Context by Parallel Decoding

Paper • 2408.11745 • Published Aug 21 • 23

upvoted an article about 1 month ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19

• 72

upvoted 3 papers about 1 month ago

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Paper • 2408.04840 • Published Aug 9 • 31

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

Paper • 2408.01337 • Published Aug 2 • 10

upvoted an article about 1 month ago

Article

Memory-efficient Diffusion Transformers with Quanto and Diffusers

Jul 30

• 52

upvoted 2 papers about 1 month ago

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation

Paper • 2408.02545 • Published Aug 5 • 32

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent

Paper • 2407.21646 • Published Jul 31 • 18

upvoted an article about 1 month ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29

• 212

upvoted a paper about 2 months ago

CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Paper • 2408.03910 • Published Aug 7 • 15

upvoted a collection 2 months ago

Tiny Models

Collection

8 items • Updated Aug 24 • 8

upvoted a paper 2 months ago

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23 • 40

upvoted an article 2 months ago

Article

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25

• 18

upvoted a paper 2 months ago

GEB-1.3B: Open Lightweight Large Language Model

Paper • 2406.09900 • Published Jun 14 • 20

upvoted a collection 3 months ago

NuminaMath

Collection

Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 57

upvoted a paper 3 months ago

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16 • 125

upvoted a collection 3 months ago

FP8 LLMs for vLLM

Collection

Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 43 items • Updated 8 days ago • 53

upvoted an article 3 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 244

upvoted 2 papers 3 months ago

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

Paper • 2407.08348 • Published Jul 11 • 50

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions

Paper • 2407.06358 • Published Jul 8 • 17

upvoted an article 3 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 170

upvoted 2 collections 4 months ago

VideoLLaMA 2

Collection

Optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding capability • 11 items • Updated Aug 31 • 17

Qwen2

Collection

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 18 days ago • 340

upvoted a paper 4 months ago

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20 • 33

upvoted a paper 5 months ago

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118

upvoted a collection 5 months ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 17 days ago • 473

upvoted an article 5 months ago

Article

Design choices for Vision Language Models in 2024

Apr 16

• 24

upvoted 4 papers 6 months ago

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 30

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11 • 36

RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion

Paper • 2404.07199 • Published Apr 10 • 25

ReALM: Reference Resolution As Language Modeling

Paper • 2403.20329 • Published Mar 29 • 20

upvoted 2 papers 7 months ago

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Paper • 2402.16840 • Published Feb 26 • 23

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Paper • 2402.13616 • Published Feb 21 • 45

upvoted 2 papers 8 months ago

FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29

upvoted a collection 8 months ago

Gemma release

Collection

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted a paper 8 months ago

Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 86

upvoted a collection 9 months ago

AIM

Collection

AIM: Autoregressive Image Models • 5 items • Updated 1 day ago • 48

upvoted a paper 9 months ago

Synthesizing Moving People with 3D Control

Paper • 2401.10889 • Published Jan 19 • 12

upvoted a paper 10 months ago

LLM360: Towards Fully Transparent Open-Source LLMs

Paper • 2312.06550 • Published Dec 11, 2023 • 56

upvoted a paper about 1 year ago

On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

Paper • 2307.09793 • Published Jul 19, 2023 • 46
