How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published Jun 17
Efficient Continual Pre-training by Mitigating the Stability Gap Paper • 2406.14833 • Published Jun 21