OmniBench: Towards The Future of Universal Omni-Language Models • Paper • 2409.15272 • Published 11 days ago • 24
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models • Paper • 2409.16191 • Published 10 days ago • 39
Qwen2.5-Coder • Collection • Code-specific model series based on Qwen2.5 • 14 items • Updated 9 days ago • 69
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding • Paper • 2409.03420 • Published 29 days ago • 23
FuzzCoder: Byte-level Fuzzing Test via Large Language Model • Paper • 2409.01944 • Published about 1 month ago • 44
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering • Paper • 2408.09174 • Published Aug 17 • 51
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery • Paper • 2408.06292 • Published Aug 12 • 114
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23 • 67
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs • Paper • 2406.18521 • Published Jun 26 • 25