Cheems: Wonderful Matrices More Efficient and More Effective Architecture Paper • 2407.16958 • Published Jul 24 • 2
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17 • 77
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression Paper • 2407.12077 • Published Jul 16 • 54