arxiv_cs_cl · Jun 16, 2026 · paper
ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues
In brief
Reproducing research results from papers and released code is central to scientific progress. Existing works have introduced benchmarks to evaluate whether LLM agents can assist with reproducibility, but they are diff...
agent · evaluation · codex