Molmo Collection: Artifacts for open multimodal language models. • 5 items • Updated 7 days ago • 271
Meta Llama 3 Collection: This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases. • 5 items • Updated Sep 25 • 683
Qwen1.5 Collection: Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Sep 18 • 206
Canonical models Collection: This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e., repos that were not under an org or user namespace. • 68 items • Updated Feb 13 • 13
Switch-Transformers release Collection: This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 31 • 15
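As a minimal sketch (not taken from the collection page itself), the snippet below shows how one of these MoE checkpoints could be loaded with the transformers library; the model id google/switch-base-8 is assumed to be the 8-expert base variant from this release.

```python
# Minimal sketch: load a Switch-Transformers checkpoint via the transformers library.
# The model id "google/switch-base-8" (8-expert base variant) is an assumption here.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/switch-base-8")
model = AutoModelForSeq2SeqLM.from_pretrained("google/switch-base-8")

# Switch-Transformers checkpoints follow T5-style span corruption, so we prompt
# with sentinel tokens rather than a task prefix.
text = "A <extra_id_0> walks into a bar and orders a <extra_id_1> with a pinch of salt."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```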
LLM in a flash: Efficient Large Language Model Inference with Limited Memory • Paper • arXiv:2312.11514 • Published Dec 12, 2023 • 258
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning • Paper • arXiv:2301.13688 • Published Jan 31, 2023 • 8