- FlashDecoding++: Faster Large Language Model Inference on GPUs
  Paper • 2311.01282 • Published • 35
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 21
- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
  Paper • 2402.17193 • Published • 23
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 134
Collections including paper arxiv:2409.12917
- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 159
- Mistral 7B
  Paper • 2310.06825 • Published • 47

- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 24
- Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
  Paper • 2310.08678 • Published • 12
- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
  Paper • 2310.09478 • Published • 19