The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27 • 36
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability Paper • 2408.07852 • Published Aug 14 • 14
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community Paper • 2408.08291 • Published Aug 15 • 9
InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning Paper • 2408.07089 • Published Aug 9 • 12
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation Paper • 2408.05928 • Published Aug 12 • 6
Design Proteins Using Large Language Models: Enhancements and Comparative Analyses Paper • 2408.06396 • Published Aug 12 • 8
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data Paper • 2408.06273 • Published Aug 12 • 9
MovieSum: An Abstractive Summarization Dataset for Movie Screenplays Paper • 2408.06281 • Published Aug 12 • 9
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13 • 29
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13 • 65
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers Paper • 2408.05506 • Published Aug 10 • 8
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Paper • 2408.06195 • Published Aug 12 • 58
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection Paper • 2408.04284 • Published Aug 8 • 21
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8 • 154
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Paper • 2408.02545 • Published Aug 5 • 32
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29 • 37
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 73
ShieldGemma: Generative AI Content Moderation Based on Gemma Paper • 2407.21772 • Published Jul 31 • 13
Searching for Best Practices in Retrieval-Augmented Generation Paper • 2407.01219 • Published Jul 1 • 10
Papers Collection • Large Language Model (LLM) and NLP related papers • 120 items • Updated 2 days ago • 7
Course-Correction: Safety Alignment Using Synthetic Preferences Paper • 2407.16637 • Published Jul 23 • 24
AI Paper of the Day Collection • A collection of papers that I think are interesting, one added each day • 182 items • Updated about 4 hours ago • 24
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? Paper • 2407.15711 • Published Jul 22 • 9
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents Paper • 2403.08715 • Published Mar 13 • 20
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 72
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 592
Evaluating Very Long-Term Conversational Memory of LLM Agents Paper • 2402.17753 • Published Feb 27 • 18
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models Paper • 2402.14848 • Published Feb 19 • 18
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding Paper • 2402.16671 • Published Feb 26 • 26
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12 • 45
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment Paper • 2308.05374 • Published Aug 10, 2023 • 27
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models Paper • 2308.00304 • Published Aug 1, 2023 • 23