REPOEXEC: Evaluate Code Generation with a Repository-Level Executable Benchmark • Paper • 2406.11927 • Published Jun 17
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions • Paper • 2406.15877 • Published Jun 22