Artificial intelligence company DeepMind has built a tool capable of creating working code to solve complex software problems
February 2, 2022
DeepMind, a UK-based artificial intelligence company, has taught some of its machines to write computer software, and the resulting system performs almost as well as an average human programmer when judged in competition.
The new AlphaCode system is claimed by DeepMind to be capable of solving software problems that require a combination of logic, critical thinking and the ability to understand natural language. The tool was entered in 10 rounds on the Codeforces programming contest website, where human participants test their coding skills. In those 10 rounds, AlphaCode placed roughly level with the median competitor. DeepMind claims this is the first time an AI code writing system has achieved a competitive level of performance in programming competitions.
AlphaCode was created by training a neural network on numerous coding samples, sourced from the GitHub software repository and from previous entrants to contests on Codeforces. When confronted with a new problem, it creates a huge number of candidate solutions in the C++ and Python programming languages. It then filters them and ranks them into a top 10. When AlphaCode was tested in competition, humans evaluated these solutions and submitted the best of them.
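The filter-and-rank step described above can be illustrated with a minimal Python sketch. This is not DeepMind's implementation: the article does not say how AlphaCode filters or ranks its candidates, so the sketch assumes that candidates are filtered by running them on the example cases from the problem statement and ranked by a made-up confidence score. The names `passes_examples`, `filter_and_rank` and `scores` are hypothetical, and plain Python functions stand in for generated programs.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a generated program: takes an int, returns an int.
Candidate = Callable[[int], int]


def passes_examples(candidate: Candidate, examples: List[Tuple[int, int]]) -> bool:
    """Return True if the candidate gives the expected output on every example."""
    for given, expected in examples:
        try:
            if candidate(given) != expected:
                return False
        except Exception:
            # A candidate that crashes is treated the same as a wrong answer.
            return False
    return True


def filter_and_rank(
    candidates: List[Candidate],
    examples: List[Tuple[int, int]],
    scores: List[float],
    top_k: int = 10,
) -> List[int]:
    """Return indices of up to top_k surviving candidates, best score first.

    `scores` is an assumed per-candidate confidence value; ties are broken
    by the candidate's original position.
    """
    survivors = [
        (score, index)
        for index, (candidate, score) in enumerate(zip(candidates, scores))
        if passes_examples(candidate, examples)
    ]
    survivors.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in survivors[:top_k]]


# Toy problem: "double the input".
candidates: List[Candidate] = [
    lambda x: x * 2,   # correct
    lambda x: x + x,   # correct, written differently
    lambda x: x ** 2,  # right for 2, wrong for 3
    lambda x: x // 0,  # crashes on every input
    lambda x: x + 2,   # right for 2, wrong for 3
]
examples = [(2, 4), (3, 6)]
scores = [0.5, 0.9, 0.8, 0.7, 0.6]  # invented values, for illustration only

print(filter_and_rank(candidates, examples, scores))
```

In this toy run only the first two candidates survive the filter, and the second is ranked first because of its higher assumed score. A real system would run untrusted generated programs in a sandbox rather than calling them directly.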
Code generation is a particularly tricky problem for AI because it’s difficult to gauge how close a particular output is to success. Code that crashes and therefore fails to achieve its goal may be one character away from a perfectly working solution, and several working solutions may appear drastically different. Solving programming contests also requires AI to extract meaning from a problem description written in English.
Microsoft-owned GitHub last year created a similar but more limited tool called Copilot. Millions of people use GitHub to share source code and organize software projects. GitHub used this code to train the neural network behind Copilot, allowing it to solve similar programming problems.
But the tool was controversial because many claimed it could directly plagiarize this training data. Armin Ronacher of software company Sentry discovered that it was possible to prompt Copilot to suggest copyrighted code from the 1999 computer game Quake III Arena, complete with comments from the original programmer. This code cannot be reused without permission.
When Copilot launched, GitHub said that about 0.1% of its code suggestions may contain “a few snippets” of verbatim source code from the training set. The company also warned that it is possible for Copilot to produce genuine personal data such as phone numbers, email addresses or names, and that the code it produces may be “biased, discriminatory, abusive or offensive” or include security vulnerabilities. It says the code should be checked and tested before use.
AlphaCode, like Copilot, was first trained on publicly available code hosted on GitHub. It was then fine-tuned on code from programming contests. DeepMind says AlphaCode does not copy code from previous examples. Judging by the examples DeepMind supplied in its preprint paper, it seems to solve problems while copying only slightly more code from the training data than humans already do, says Riza Theresa Batista-Navarro at the University of Manchester, UK.
But AlphaCode seems to have been so finely tuned to solve complex challenges that prior state-of-the-art AI coding tools can still outperform it on simpler tasks, she says.
“What I’ve noticed is that while AlphaCode is able to do better than cutting-edge AIs like GPT on competitive challenges, it does relatively poorly on introductory challenges,” says Batista-Navarro. “The assumption is that they wanted to do competition-level programming problems, to tackle more difficult programming problems rather than introductory problems. But it seems to show that the model was so well honed on the more complicated issues that somehow it kind of forgot about the introductory-level issues.”
DeepMind was not available for an interview, but DeepMind’s Oriol Vinyals said in a statement: “I did not expect ML [machine learning] to reach a human average among competitors. However, this indicates that there is still work to be done to reach the level of the highest performers and to advance the problem-solving capabilities of our AI systems.”