The State of AI in the Software Development Lifecycle

AI can’t replace human developers yet, but it can make code suggestions and facilitate software testing, deployment, and management.

With tech skills in short supply and businesses demanding that applications be delivered more quickly, developers are turning to artificial intelligence and machine learning (AI/ML) to automate parts of the software development lifecycle. Most notably, AI can make code suggestions and facilitate software testing, deployment, operations and maintenance.
For example, developers say auto-complete tools that suggest code can be surprisingly accurate. Today AI is most useful in later stages of the development process such as testing, deployment, and management. These are the stages where the data used to train an AI application is more accurate, and where less human insight and creativity are required.
While AI can make the lives of developers, testers and operations people easier by automating some tasks, it’s unlikely to replace those jobs in the foreseeable future. This is especially true for work in early-stage planning and design, where developers need to understand and gain insights into the needs of the people who will use the software. Code suggestion tools are effective for simple coding tasks, but less so for complex ones. Likewise, while AI is effective in testing and verifying the security of applications, today it works best for simple test automations. And in the MLOps space, AI can help by suggesting operational fixes but is less effective at carrying them out itself.
Planning and Requirements
Defining the scope of a development project and its requirements—such as the ability to check inventory levels or place an order—are the essential first steps in the development cycle. Currently, AI can only play a minor role in this step, according to developers, because every application has unique characteristics, and defining its scope requires understanding the needs of human users.
“This is not an area where AI is particularly helpful right now,” said Cameron Sechrist, founder of Streamlined, an AI-powered marketing and business operations platform. “Ideally, the best software is defined by the humans who use it,” not by an algorithm. While cutting-edge research uses tools such as sentiment analysis that lets users or developers describe what they need in plain language, “we are not there yet,” he said, as far as the algorithm producing the required code.
One promising use of AI is to validate the requirements developers use to create applications, said Sechrist. Such validation might ensure, for example, that each requirement identifies who will use each function and the specific action they will take using that function. However, there are currently no widely available products that provide such validation, he said.
The earlier stages of software development—planning, requirements, and design—belong to the notoriously difficult class of ML problems that require reasoning and insight. Those “may be the last to receive a significant, AI-powered treatment,” said Steven Hao, a software engineer at AI data management vendor Scale AI.
Software Development and Coding
While developers are intrigued by AI-based tools that suggest code, most are not yet ready to trust the code they produce, especially for complex operations.
Tools such as Microsoft’s IntelliSense and GitHub’s Copilot do a great job of speeding up a developer’s workflow and are widely used to efficiently make edits to source code, said Hao. Classical language services such as IntelliSense parse your input source code to understand language semantics—“not unlike a code compiler,” he said. “Copilot and other AI pair programming software use language models trained on open source code to go beyond the syntax rules for a programming language, going deeper and understanding stylistic conventions and programmer intent as well.”
Even while in technical preview, Copilot (which will move to general availability this summer) already suggests about 35% of the code in popular languages such as Java and Python, according to a Microsoft blog post. As such tools evolve, they will increasingly allow developers to use natural language commands to generate code. This increases their productivity while allowing a wider, more diverse audience of users to create applications to meet their needs.
Using Codex, an ML model from AI research and development company OpenAI, a developer “working in the graphics rendering engine Babylon.js entered the natural language command ‘create a model of the solar system’ into the text box and the AI-powered software translated the command into code for a solar system model,” the post said.
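To make the workflow concrete, here is a minimal sketch of the kind of completion such models produce: the developer supplies only a natural-language comment and a function signature, and the tool fills in the body. The prompt, function name, and implementation below are hypothetical illustrations, not actual Codex output.

```python
from collections import Counter
import re

# Prompt the developer might type: "return the n most common words
# in a text, ignoring case". A Codex-style model would generate the
# body below from that comment and the signature alone.
def most_common_words(text: str, n: int) -> list[str]:
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word, _ in Counter(words).most_common(n)]

print(most_common_words("the cat and the dog and the bird", 2))  # → ['the', 'and']
```

As the article's sources note below, such a completion still has to be reviewed and tested like any other code before it is trusted.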
Peter Welinder, vice president of products and partnerships at OpenAI, said Copilot also allows developers to automatically document applications as they develop them. “You get a bunch of comments in the code just from the nature of telling Copilot what to do,” he said. “You’re documenting the code as you go.”
Microsoft has integrated GPT-3, the OpenAI natural language model used in Codex, with Microsoft Power Apps to allow users to create applications using conversational language. It plans to leverage AI models to turn drawings, images, PDFs, and design files into software applications.
Julia Liuson, president of the developer division at Microsoft, envisions AI-powered models and tools that will help developers of all ability levels clean data, check code for errors, debug programs, and explain what blocks of code mean in natural language.
Nick Hodges, developer advocate at Rollbar, which provides AI-assisted developer workflows and error detection, said he found Copilot’s suggestions so accurate that it “freaked me out.” It’s extremely prescient, he added, at times suggesting code he hadn’t realized he was about to type. However, he didn’t always take its suggestions, especially for more complex code. “I didn’t feel I could 100% trust it because I didn’t know where it came from. It just produced code someone else wrote in a very similar situation.”
Copilot “is pretty far from ideal,” said Streamlined’s Sechrist. “I don’t use it in my day-to-day workflow.” While Copilot has the potential of analyzing large code bases to suggest code, it is limited in its ability to gather and analyze enough code to produce answers to complex coding challenges.
“I think it could be very useful to help people learn how to code for lightweight, front-end coding such as with HTML or CSS,” said Sechrist. “For more advanced back-end work, I don’t know if it will ever truly be innovative enough to be able to adapt to the uniqueness of every codebase.”
Sechrist said Copilot will be less useful for complex infrastructure that must support very high transaction loads and multiple connections with platforms and databases, including different APIs and data schemas. He doesn’t know if it’s possible that Copilot “could take every edge case into account and every vulnerability and security flaw.”
In addition, he said, development tools change so quickly that “by the time you train a model, there is a new version of the development tool out there.”
GitHub did not respond to requests for comment, but on its website it acknowledges that, while Copilot “tries to understand your intent and to generate the best code it can … the code it suggests may not always work, or even make sense” and should “be carefully tested, reviewed, and vetted, like any other code.”
Testing
Given the high potential for automation, testing is another area where AI promises major savings in time and effort.
AI is most commonly used in test automation and test data design and generation, said Lucas Bonatto Miguel, founder and CEO of Elemeno, a platform that helps data scientists build highly scalable software infrastructure. AI can also help create test data that is more representative of real-world data, he said, which can make tests more accurate and help generate large amounts of test data quickly—a feature that’s especially useful when testing complex applications.
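The test data generation Miguel describes can be sketched simply. A real ML-based tool would learn field distributions and correlations from production data; in this illustration the distributions, field names, and status weights are hard-coded assumptions.

```python
import random

# Hypothetical order schema; an ML tool would infer fields and
# distributions from real production records instead.
STATUSES = ["pending", "shipped", "delivered", "returned"]

def make_order(rng: random.Random) -> dict:
    quantity = rng.randint(1, 5)
    unit_price = round(rng.uniform(5.0, 200.0), 2)
    return {
        "order_id": rng.randrange(10**6),
        "quantity": quantity,
        "unit_price": unit_price,
        "total": round(quantity * unit_price, 2),  # keep derived fields consistent
        "status": rng.choices(STATUSES, weights=[2, 3, 4, 1])[0],
    }

def make_test_data(n: int, seed: int = 42) -> list[dict]:
    rng = random.Random(seed)  # fixed seed makes test runs reproducible
    return [make_order(rng) for _ in range(n)]

orders = make_test_data(1000)
print(len(orders))  # → 1000
```

Even this toy version shows the appeal: thousands of internally consistent records can be produced in milliseconds, which matters when testing complex applications.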
One of AI’s most important benefits lies in helping testers find bugs that would otherwise be difficult or impossible to detect, he said. It can also streamline test processes by automating repetitive tasks and improve test results by providing more accurate and detailed feedback.
AI in software testing still has limits, Miguel said: it requires large amounts of data to learn from before it can produce accurate results. Furthermore, AI algorithms need further development to move beyond automating simple tasks to handling more complex tests that today still require human involvement, he added.
Streamlined’s Sechrist has not used AI to generate test cases for unit testing or integration testing, but he relies on the real-time scanning and AI semantic analysis of the Snyk security platform “to help me decide which test cases to write” as it generates flags about potential security or other issues in his code.
AI is going to be “very critical” for testing and verifying the security of applications, he said.
Deployment, Operations, Maintenance, and Security
The abundance of data—in the form of alerts, logs, and other information about applications and infrastructure—makes maintenance, optimization, and security among the most well-known uses of AI in the development lifecycle. This practice, often called MLOps, uses fast analysis of vast amounts of operational data to focus IT staff on the most critical alerts, reduce the time needed to identify problems and recommend fixes, and suggest ways to get the best price and performance from complex environments.
However, this approach does require the collection and management of large amounts of data, and at least for now AI is better at suggesting fixes (or following established rules for such remediation) than taking corrective action itself, developers said.
AI has been “very useful for us” in this area, said Streamlined’s Sechrist, automatically alerting him to attacks or to performance issues that require the provisioning of new resources in the cloud without human workers having to constantly monitor performance or security dashboards. AI is more effective in this area, he said, because the algorithms can be trained on readily accessible and well-understood security and performance data that companies have been gathering for years. Another advantage is that this data is relatively easy to use because it requires less security protection than does customer data, he said.
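The monitoring Sechrist describes rests on learning a baseline from historical performance data and flagging departures from it. Production platforms use learned models; a simple z-score threshold, sketched below with hypothetical latency numbers, illustrates the underlying idea.

```python
import statistics

def is_anomalous(history: list[float], sample: float, threshold: float = 3.0) -> bool:
    """Flag a sample that sits far outside the recent baseline."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return sample != mean  # flat baseline: any change is notable
    return abs(sample - mean) / stdev > threshold

# Hypothetical request latencies (ms) gathered from a dashboard feed.
latencies_ms = [101, 99, 103, 98, 100, 102, 97, 100]
print(is_anomalous(latencies_ms, 104))  # normal variation → False
print(is_anomalous(latencies_ms, 450))  # likely incident → True
```

This is the sense in which well-understood, long-collected performance data makes the problem tractable: the baseline is cheap to compute and the alert condition is unambiguous.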
Security monitoring “is the most mature area for AI because those patterns tend to be more immutable, less apt to change, once we figure them out,” said Nima Badley, vice president of global alliance at GitLab, creator of the DevOps platform of the same name. Building an AI tool that can catch “all the different possible mistakes a developer could make” requires analyzing far more data, from far more sources, than is often available.
Such data collection is also often hobbled by “a hodgepodge of tools that don’t share data very well and don’t have common metadata,” he said. An AI bot that can finish a piece of code is “less interesting” than using AI to test the completed code, understand its dependency on various platforms, and alert people about whether the code exposes the application to attack, he said.
Currently, AI is more effective at monitoring environments than it is at providing automatic responses to problems, Rollbar’s Hodges said. “Right now, I don’t think the system is capable of writing the rules. It’s capable of following the rules.”
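Hodges’ distinction can be made concrete: in the sketch below the remediation rules are written by humans, and the system only evaluates them against incoming alerts. The metric names, thresholds, and action labels are hypothetical.

```python
# Human-authored remediation rules: (condition, action) pairs.
# The system follows these rules; it does not write them.
RULES = [
    (lambda a: a["metric"] == "cpu_percent" and a["value"] > 90, "scale_out"),
    (lambda a: a["metric"] == "error_rate" and a["value"] > 0.05, "page_oncall"),
    (lambda a: a["metric"] == "disk_percent" and a["value"] > 80, "expand_volume"),
]

def respond(alert: dict) -> str:
    """Return the first matching action, or fall back to logging."""
    for condition, action in RULES:
        if condition(alert):
            return action
    return "log_only"

print(respond({"metric": "cpu_percent", "value": 97}))   # → scale_out
print(respond({"metric": "error_rate", "value": 0.01}))  # → log_only
```

Writing the rules themselves—deciding which thresholds and actions are safe—remains the human part of the job.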
Best Practices
Developers can best use AI by understanding the capabilities and limitations of each tool, said Scale AI’s Hao. Code completion aids that rigorously understand a language’s syntax “are great for making systematic changes to existing codebases, large-scale refactoring tasks like moving code between modules, or renaming variables.” ML-powered pair programmers can help developers learn new languages and frameworks due to their ability to suggest idiomatic code completions. “At best, these code completions work out of the box,” he said; “at worst, they serve as boilerplate that can be adapted.”
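The syntax-aware refactoring Hao mentions can be sketched with Python’s standard library alone: renaming a variable by transforming the parse tree, rather than doing a textual search-and-replace, leaves string literals and unrelated names untouched. The example code being refactored is hypothetical.

```python
import ast

class Rename(ast.NodeTransformer):
    """Rename a variable everywhere it appears as an identifier."""
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node

source = "total = 0\nfor x in items:\n    total = total + x\nprint('total')"
tree = Rename("total", "running_sum").visit(ast.parse(source))
print(ast.unparse(tree))  # the string 'total' in print() is untouched
```

This is the sense in which such tools are “systematic”: because the change operates on syntax rather than text, it cannot accidentally rewrite a string or a different variable that happens to share the name.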
When using AI for testing, Elemeno’s Miguel recommends clearly defining the test objectives, identifying the right data set to train the AI model, and working with domain experts to ensure the test is designed correctly. He warns against over-relying on AI without human oversight, and advises testing and validating the AI models and understanding how they produce their recommendations.
“I personally don’t think AI will surpass humans in math or programming competitions in this decade,” said Hao. “While I expect coding, testing, and maintenance to become more and more the job of machines, software design and architecture will remain the domain of humans for now.”