Qwen2.5-Coder • Collection • Code-specific model series based on Qwen2.5 • 40 items • Updated 3 days ago • 224 upvotes
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data • Paper • arXiv:2410.18558 • Published 29 days ago • 18 upvotes
Granite 3.0 Language Models • Collection • A series of language models trained by IBM, licensed under the Apache 2.0 license. Both the base pretrained and instruct models are released. • 8 items • Updated 17 days ago • 89 upvotes
LayerSkip • Collection • Models continually pretrained using LayerSkip (https://arxiv.org/abs/2404.16710) • 8 items • Updated about 7 hours ago • 43 upvotes
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention • Paper • arXiv:2410.05076 • Published Oct 7 • 6 upvotes
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models • Paper • arXiv:2409.17481 • Published Sep 26 • 46 upvotes
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations • Paper • arXiv:2410.02707 • Published Oct 3 • 48 upvotes
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning • Paper • arXiv:2410.02884 • Published Oct 3 • 50 upvotes
Addition is All You Need for Energy-efficient Language Models • Paper • arXiv:2410.00907 • Published Oct 1 • 144 upvotes
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction • Paper • arXiv:2409.17422 • Published Sep 25 • 24 upvotes
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions • Paper • arXiv:2409.18042 • Published Sep 26 • 36 upvotes
HMoE: Heterogeneous Mixture of Experts for Language Modeling • Paper • arXiv:2408.10681 • Published Aug 20 • 8 upvotes