- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 82
- Small-scale proxies for large-scale Transformer training instabilities
  Paper • 2309.14322 • Published • 19
- Evaluating Cognitive Maps and Planning in Large Language Models with CogEval
  Paper • 2309.15129 • Published • 6
- Vision Transformers Need Registers
  Paper • 2309.16588 • Published • 77
Tristan Marechaux
tmarechaux
AI & ML interests
LLMs and ML for code
Recent Activity
updated a collection 19 days ago
LLMs
upvoted a paper about 2 months ago
Differential Transformer
updated a collection about 2 months ago
Theorical
Organizations
Collections: 5
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
  Paper • 2309.09400 • Published • 83
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53
- Chain-of-Verification Reduces Hallucination in Large Language Models
  Paper • 2309.11495 • Published • 38
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 87
Models: 1
Datasets: None public yet