AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

TLDR

This paper introduces Multiagent-Code Generation (AgentCoder), a multi-agent framework with three specialized agents: the programmer agent, the test designer agent, and the test executor agent. Their collaboration yields more effective code generation than single-agent models and previous strategies.

Abstract

Advances in natural language processing (NLP) have been significantly boosted by the development of transformer-based large language models (LLMs). These models have revolutionized NLP tasks, particularly code generation, helping developers create software more efficiently. Despite these advances, challenges remain in balancing code snippet generation with effective test case generation and execution. To address these issues, this paper introduces Multiagent-Code Generation (AgentCoder), a novel solution comprising a multi-agent framework with specialized agents: the programmer agent, the test designer agent, and the test executor agent. During the coding procedure, the programmer agent generates code and refines it based on the test executor agent’s feedback. The test designer agent generates test cases for the generated code, and the test executor agent runs the code against the test cases and writes feedback to the programmer agent. This collaborative system ensures more effective code generation, surpassing the limitations of single-agent models and previous strategies. Our extensive experiments on 12 LLMs and 13 optimisation approaches show that AgentCoder outperforms existing code generation models and prompt engineering techniques across various benchmarks. For example, with GPT-3.5, AgentCoder achieves 77.4% and 89.1% pass@1 on HumanEval-ET and MBPP-ET, while the previous state of the art obtains only 69.5% and 63.0%.
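The interaction between the three agents can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `max_rounds` limit, and the toy stand-ins for the two LLM-backed agents are all assumptions made for illustration. The sketch also assumes the test designer writes its tests once from the task description, and it runs code with a bare `exec`, where a real system would need a sandbox.

```python
import traceback
from typing import Callable, Optional, Tuple

# Hypothetical agent signatures; in practice each would wrap an LLM call.
Programmer = Callable[[str, Optional[str]], str]  # (task, feedback) -> code
TestDesigner = Callable[[str], str]               # task -> test code


def execute_tests(code: str, tests: str) -> Tuple[bool, str]:
    """Test executor: run the code, then the tests, and report feedback."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # define the candidate solution
        exec(tests, namespace)  # run the assertions against it
    except Exception:
        # The traceback text is the feedback handed back to the programmer.
        return False, traceback.format_exc()
    return True, "all tests passed"


def agent_loop(
    task: str,
    programmer: Programmer,
    test_designer: TestDesigner,
    max_rounds: int = 3,  # assumed iteration budget
) -> Tuple[str, bool, int]:
    """Iterate generate -> test -> refine until tests pass or budget ends."""
    tests = test_designer(task)
    feedback: Optional[str] = None
    code, passed, rounds = "", False, 0
    for rounds in range(1, max_rounds + 1):
        code = programmer(task, feedback)
        passed, feedback = execute_tests(code, tests)
        if passed:
            break
    return code, passed, rounds


# Toy stand-ins for the LLM-backed agents (purely illustrative).
def toy_programmer(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"  # buggy first draft
    return "def add(a, b):\n    return a + b\n"      # refined after feedback


def toy_test_designer(task: str) -> str:
    return "assert add(1, 2) == 3\nassert add(-1, 1) == 0\n"


final_code, passed, rounds = agent_loop(
    "Write add(a, b).", toy_programmer, toy_test_designer
)
```

In this toy run the first draft fails the designed tests, the executor's traceback is passed back as feedback, and the second draft passes. Keeping the test designer separate from the programmer means the tests are not shaped by the candidate code they are meant to check.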