KrishnaKaasyap (Krishna Kaasyap)

upvoted a paper about 1 month ago

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22 • 50

upvoted a collection about 1 month ago

Jamba-1.5

Collection

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Aug 22 • 75

upvoted 2 collections about 2 months ago

Magnum v2 123b

Collection

3 items • Updated Aug 21 • 6

DeepSeek-V2

Collection

7 items • Updated about 1 month ago • 14

upvoted an article about 2 months ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23

• 198

upvoted a paper about 2 months ago

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

upvoted a collection 2 months ago

Llama-3.1 Quantization

Collection

Neural Magic quantized Llama-3.1 models • 21 items • Updated 9 days ago • 35

upvoted 2 articles 2 months ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29

• 212

Article

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

Jul 27

• 22

upvoted a collection 2 months ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 10 days ago • 586

upvoted a paper 3 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24 • 55

upvoted 4 collections 4 months ago

upvoted an article 5 months ago

Article

Merge Large Language Models with mergekit

Jan 9

• 69

upvoted 4 collections 5 months ago

Llama3-ChatQA-1.5

Collection

Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 5 days ago • 39

LLaVA-Llama-3-8B

Collection

8 items • Updated Apr 28 • 18

Arctic

Collection

A collection of pre-trained dense-MoE Hybrid transformer models • 2 items • Updated Apr 24 • 22

OpenELM Instruct Models

Collection

4 items • Updated 2 days ago • 113

upvoted 3 collections 6 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 10 days ago • 676

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 17 days ago • 473

LLaVA-1.6

Collection

A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31 • 64

upvoted an article 6 months ago

Article

Public Policy at Hugging Face

Apr 8

• 19

upvoted 2 papers 7 months ago

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29 • 52

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592

upvoted a paper 8 months ago

Keyframer: Empowering Animation Design using Large Language Models

Paper • 2402.06071 • Published Feb 8 • 13

upvoted a collection 8 months ago

Qwen1.5

Collection

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 18 days ago • 206

upvoted a paper 8 months ago

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Paper • 2401.15947 • Published Jan 29 • 48

upvoted a collection 9 months ago

Llamafied Yi

Collection

Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9

upvoted a collection 10 months ago

Seamless Communication

Collection

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 146

upvoted 3 collections 11 months ago

OpenChat

Collection

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33

Recent models: last 100 repos, sorted by creation date

Collection

The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 495

Zephyr 7B

Collection

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144

upvoted a paper 12 months ago

Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Paper • 2310.13671 • Published Oct 20, 2023 • 18

upvoted 2 collections 12 months ago

Model Merging

Collection

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12 • 212

Historical - Spaces of the Week

Collection

All Spaces of the Week...from all weeks • 636 items • Updated Jan 17 • 19

upvoted 2 papers 12 months ago

Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning

Paper • 2310.12921 • Published Oct 19, 2023 • 19

AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

Paper • 2309.16058 • Published Sep 27, 2023 • 55

upvoted a collection 12 months ago

The Big Benchmarks Collection

Collection

Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 12 items • Updated May 28 • 139

upvoted a paper 12 months ago

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 74

upvoted 2 collections 12 months ago

LLM Leaderboard best models ❤️‍🔥

Collection

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated Jun 22 • 399

Awesome RLHF

Collection

A curated collection of datasets, models, Spaces, and papers on Reinforcement Learning from Human Feedback (RLHF). • 11 items • Updated Oct 2, 2023 • 7
