LLM Deployment - an OliP Collection

updated Sep 18

Llm Pricing 📊

Space • Runtime error • 244
Can You Run It? LLM version 🚀

Space • Running • 812
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

Paper • 2312.15234 • Published Dec 23, 2023 • 3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Paper • 2407.11062 • Published Jul 10 • 8
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33
Transformer Calculator 📊

Space • Running • 33