The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 93
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Paper • 2402.14658 • Published Feb 22 • 82
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 71 items • Updated 11 days ago • 88
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents Paper • 2310.19923 • Published Oct 30, 2023 • 13
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25 • 46
Transformers.js demos Collection A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11 • 87
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Paper • 2401.09417 • Published Jan 17 • 58
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models Paper • 2401.04658 • Published Jan 9 • 24
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5 • 40
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 47
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 501
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 258
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 37
Mistral 7B 16k Collection All Mistral based models that have a 16k context size and have been finetuned. • 7 items • Updated Dec 11, 2023 • 4