CFBench: A Comprehensive Constraints-Following Benchmark for LLMs Paper • 2408.01122 • Published Aug 2
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark Paper • 2408.07543 • Published Aug 14
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published Aug 27 • 52
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System Paper • 2407.06027 • Published Jul 8 • 8