Comparative Analysis of Pre-Trained Code Language Models for Automated Program Repair via Code Infill Generation
Automated Program Repair (APR) has advanced significantly with the emergence of pre-trained Code Language Models (CLMs), which enable the generation of high-quality patches. However, selecting the most suitable CLM for APR remains challenging because the choice depends on several interdependent factors, including accuracy, efficiency, and scalability.
This study systematically evaluates 20 pre-trained CLMs, ranging from 60M to 16B parameters, on the HumanEval-Java benchmark (163 buggy Java methods). The evaluation examines bug-fixing accuracy, resource consumption, compilability, patch diversity, and sampling strategies (beam search vs. nucleus sampling).
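To make the second of the two sampling strategies concrete, the following is a minimal sketch of nucleus (top-p) sampling over a single next-token distribution. It is not the study's implementation: the function names, the toy vocabulary, and the `top_p` value are all assumptions chosen for illustration.

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches top_p, then renormalize so the kept set sums to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def nucleus_sample(probs, top_p, rng):
    """Draw one token at random from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution (not taken from any model).
toy_probs = {"return": 0.5, "if": 0.3, "for": 0.15, "while": 0.05}
rng = random.Random(0)
next_token = nucleus_sample(toy_probs, top_p=0.75, rng=rng)
```

Beam search, by contrast, is deterministic: it keeps the highest-scoring partial sequences at every step instead of drawing from a truncated distribution.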
Results indicate that larger models such as CodeLLaMA-13B and StarCoder generally perform better at fixing bugs and handling compiler errors, but scale alone does not guarantee effectiveness: some models (e.g., CodeGen2) underperform despite their size. Notably, memory usage increases with model size, whereas time consumption shows no clear correlation with it, suggesting that efficiency is influenced by architecture rather than scale alone. Additionally, nucleus sampling slightly outperforms beam search, though the difference is not statistically significant. Since no single CLM fixes all bugs, these findings highlight the potential of hybrid or ensemble-based CLM-driven APR approaches for more robust bug fixing.
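The observation that no single CLM fixes all bugs can be illustrated by comparing the sets of bugs each model fixes. The sketch below is an assumption-laden illustration, not the study's analysis: the model names, bug identifiers, and helper functions are hypothetical, and the data is invented solely to show the computation.

```python
def ensemble_coverage(fixed_by_model):
    """Return the set of bugs fixed by at least one model."""
    return set().union(*fixed_by_model.values())


def unique_fixes(fixed_by_model):
    """For each model, return the bugs that no other model fixes."""
    result = {}
    for model, fixed in fixed_by_model.items():
        others = set().union(
            *(f for m, f in fixed_by_model.items() if m != model)
        )
        result[model] = fixed - others
    return result


# Hypothetical data: placeholder model names and bug IDs, not study results.
fixed_by_model = {
    "model_a": {"bug_01", "bug_02", "bug_03"},
    "model_b": {"bug_02", "bug_04"},
    "model_c": {"bug_03", "bug_05"},
}

combined = ensemble_coverage(fixed_by_model)
only_by = unique_fixes(fixed_by_model)
```

When the combined set is larger than any single model's set, and some models contribute fixes that no other model produces, an ensemble has room to outperform its best member.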
Thu 3 Jul, 10:45 - 12:30 (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:45 | 35m | Talk | CoCoCoLa: Code Completion Control Language | GPCE
11:20 | 35m | Talk | Comparative Analysis of Pre-Trained Code Language Models for Automated Program Repair via Code Infill Generation | GPCE | Iman Hemati Moghadam (Eindhoven University of Technology), Oebele Lijzenga (Universiteit Twente), Vadim Zaytsev (University of Twente)
11:55 | 35m | Talk | Imperative Program Synthesis by Abstract Static Analysis and SMT Mutations | GPCE | Aleksandar S. Dimovski (Mother Teresa University, Skopje)