IBM is releasing a family of AI agents (IBM SWE-Agent 1.0) that are powered by open LLMs and can resolve GitHub issues automatically, freeing up developers to work on other things rather than getting bogged down by their backlog of bugs that need fixing.
“For most software developers, every day starts with where the last one left off. Trawling through the backlog of issues on GitHub you didn’t deal with the day before, you’re triaging which ones you can fix quickly, which will take more time, and which ones you really don’t know what to do with yet. You might have 30 issues in your backlog and know you only have time to tackle 10,” IBM wrote in a blog post. This new family of agents aims to alleviate this burden and shorten the time developers are spending on these tasks.
One of the agents is a localization agent that finds the file and line of code causing an error. According to IBM, tracing a bug report to the responsible line of code can be time-consuming for developers. Now they can tag the bug report they’re working on in GitHub with “ibm-swe-agent-1.0” and the agent will work to find the code.
Once it has found the code, the agent suggests a fix that the developer could implement. At that point the developer could either fix the issue themselves or enlist the help of other SWE agents for further assistance.
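For teams that want to script the tagging step, the sketch below shows one way it might be done in Python. It assumes the “ibm-swe-agent-1.0” tag is applied as an ordinary GitHub issue label, which IBM's post does not spell out. It uses GitHub's documented REST endpoint for adding labels to an issue. The helper names, the repository coordinates, and the token are hypothetical placeholders.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"
# Tag quoted in the article; treating it as an issue label is an assumption.
AGENT_LABEL = "ibm-swe-agent-1.0"


def build_label_request(owner, repo, issue_number, token, label=AGENT_LABEL):
    """Build (but do not send) a request that adds `label` to a GitHub issue."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels"
    body = json.dumps({"labels": [label]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def add_agent_label(owner, repo, issue_number, token):
    """Send the request and return the issue's labels as parsed JSON."""
    request = build_label_request(owner, repo, issue_number, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `add_agent_label("my-org", "my-repo", 123, token)` with a token that has write access to issues would apply the label. Whether that alone triggers the agent depends on how IBM's integration is installed in the repository.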
Other agents in the SWE family include one that edits lines of code based on developer requests and one that can be used to develop and execute tests. All of the SWE agents can be invoked directly from within GitHub.
According to IBM’s early testing, these agents can localize and fix problems in less than five minutes and have a 23.7% success rate on SWE-bench, a benchmark that measures an AI system’s ability to resolve GitHub issues.
IBM explained that it set out to create SWE agents as an alternative to competitors that use large frontier models, which tend to cost more. “Our goal was to build IBM SWE-Agent for enterprises who want a cost efficient SWE agent to run wherever their code resides — even behind your firewall — while still being performant,” said Ruchir Puri, chief scientist at IBM Research.