
Monday, April 11, 2022

Microsoft Fixing Bugs in Machine-Written Software

Broadly useful? What kinds of bugs? If the code has been written by a machine:

Jigsaw fixes bugs in machine-written software

Published March 31, 2022

By Rahul Sharma, Principal Researcher; Suresh Parthasarathy, Principal Research Developer; Sriram Rajamani, Distinguished Scientist and Managing Director, Microsoft Research India Lab; Nagarajan Natarajan, Senior Researcher; and Arun Iyer, Principal Researcher

Large pre-trained language models such as GPT-3, Codex, and others can be tuned to generate code from natural language specifications of programmer intent. Such automated models have the potential to improve productivity for every programmer in the world. But since the models can struggle to understand program semantics, the quality of the resulting code can’t be guaranteed.

In our research paper, Jigsaw: Large Language Models meet Program Synthesis, which has been accepted at the International Conference on Software Engineering (ICSE 2022), we introduce a new tool that can improve the performance of these large language models. Jigsaw deploys post-processing techniques that understand the programs' syntax and semantics and then leverages user feedback to improve future performance. Jigsaw is designed to synthesize code for the Python Pandas API using multi-modal inputs.
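To make the multi-modal input concrete, here is a minimal sketch of what such a query might look like: a natural-language intent plus one input/output example. The class and field names are our own illustration, not the paper's API.

```python
from dataclasses import dataclass
import pandas as pd

@dataclass
class JigsawQuery:
    """Illustrative container for a Jigsaw-style multi-modal input:
    an English intent plus one I/O example (names are assumptions)."""
    intent: str                    # natural-language description of the code
    example_input: pd.DataFrame    # dataframe the synthesized code runs on
    expected_output: pd.DataFrame  # what the code should produce on that input

query = JigsawQuery(
    intent="keep only the rows of df where column 'score' exceeds 50",
    example_input=pd.DataFrame({"score": [20, 80]}),
    expected_output=pd.DataFrame({"score": [80]}, index=[1]),
)
```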

Our experience suggests that as these large language models evolve for synthesizing code from intent, Jigsaw can play an important role in improving the accuracy of the systems.

The promise, and perils, of machine-written software

Large language models like OpenAI’s Codex are redefining the landscape of programming. A software developer, while solving a programming task, can provide a description in English for an intended code fragment and Codex can synthesize the intended code in languages like Python or JavaScript. However, the synthesized code might be incorrect and might even fail to compile or run. Codex users are responsible for vetting the code before using it. With Project Jigsaw, we aim to automate some of this vetting to boost the productivity of developers who are using large language models like Codex for code synthesis.
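For illustration, this is the kind of exchange described above: a hypothetical English prompt and a plausible model completion. The example is ours; real model output may differ and is not guaranteed to be correct.

```python
import pandas as pd

# Prompt a developer might give: "drop rows of df where 'age' is missing,
# then sort the remaining rows by 'name'".
df = pd.DataFrame({"name": ["Bo", "Al", "Cy"], "age": [30, None, 25]})

# A plausible model completion. It happens to be correct here, but a
# synthesized fragment could just as easily fail to run or return the
# wrong result, which is why vetting is needed.
result = df.dropna(subset=["age"]).sort_values("name")
print(result)
```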

Suppose Codex provides a code fragment to a software developer. The developer might then undertake a basic vetting by checking whether the code compiles. If it doesn't compile, the developer might be able to use the compiler's error messages to repair it. Once the code compiles, a typical developer will test it on an input to check whether it produces the intended output. Again, the code might fail, raising an exception or producing incorrect output, and the developer would need to repair it further. We show that this process can be completely automated. Jigsaw takes as input an English description of the intended code along with an I/O example, that is, a sample input paired with the output the code should produce on it, and assures that the output Python code compiles and generates the intended output on the provided input.
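A minimal sketch of that automated check, assuming each candidate fragment assigns its answer to a variable named `result` and the I/O example is a pair of Pandas dataframes; this is our illustration of the idea, not Jigsaw's actual implementation (which also attempts repair on failure).

```python
import pandas as pd

def vet(candidate: str, example_input: pd.DataFrame,
        expected_output: pd.DataFrame) -> bool:
    """Return True if the candidate compiles, runs on the example
    input without raising, and reproduces the expected output."""
    try:
        compiled = compile(candidate, "<candidate>", "exec")  # compile check
    except SyntaxError:
        return False  # a repair pass would be attempted here
    env = {"pd": pd, "df": example_input.copy()}
    try:
        exec(compiled, env)  # run the fragment on the example input
    except Exception:
        return False  # runtime failure also triggers repair
    result = env.get("result")
    return isinstance(result, pd.DataFrame) and result.equals(expected_output)

# Keep only the candidates that pass the I/O check.
candidates = [
    "result = df[df['score'] > 50]",       # passes
    "result = df.filter('score' > 50)",    # raises at runtime, rejected
]
good = [c for c in candidates
        if vet(c, pd.DataFrame({"score": [20, 80]}),
               pd.DataFrame({"score": [80]}, index=[1]))]
```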

In our ICSE 2022 paper, Jigsaw: Large Language Models meet Program Synthesis, we evaluate this approach on Python Pandas. Pandas is a widely used API in data science, with hundreds of functions for manipulating dataframes, or tables with rows and columns. Instead of requiring a developer to memorize the usage of all these functions, an arguably better approach is to use Jigsaw. ...
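As a small example of the API surface in question (this example is ours, not from the paper): reshaping a wide table into long form is one of those hundreds of functions, and in a Jigsaw-style workflow the developer would supply the input table and the desired output rather than recalling that `melt` is the right call.

```python
import pandas as pd

wide = pd.DataFrame({"name": ["Al", "Bo"], "math": [90, 75], "art": [60, 85]})

# The intent "turn the subject columns into (subject, grade) rows" maps to:
long = wide.melt(id_vars="name", var_name="subject", value_name="grade")
print(long)
```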
