Links: Research Paper, GitHub
Can you imagine a machine writing an app for you, just by telling it what you want?
As futuristic as this scenario sounds, it’s actually here today.
Salesforce AI Research outlines conversational AI programming as a new paradigm that’s making this vision a reality, thanks to an AI system that writes software with you, in a conversation.
The first step towards this vision is now here in the form of our large-scale language model, CodeGen, which turns simple English prompts into executable code. You don’t write any code yourself; instead, you describe what the code should do, in natural language -- and the machine writes it for you.
For a quick look at how it works, let’s ask CodeGen to solve the two-sum problem: find two integers in a list that add up to a certain number.
To begin, we simply prompt the model, in plain English, to solve the two-sum problem. As you can see in the brief video below, our CodeGen model generates functioning code that solves the problem correctly.
Before proceeding further, let’s define some of the terms and ideas used in this blog:
Programming or Coding: A multi-step process designed to get a machine to achieve a goal:
Conversational AI: Technologies enabling natural interactions between a human and a computer, via a conversation conducted in the human's native language.
Types of Computer Programming
While software engineering concepts and methodologies have evolved over the past few decades (programming languages, web services, cloud computing, and so forth), the classical paradigm in which one writes the code (the underlying building block of software) has remained mostly untouched. Since our research proposes a new way to program, it’s instructive to see how it compares to other ways of programming:
Up till now, we have had two ways to get computers to do useful work:
Option 1 is great, when the computer programs you need are available.
But Option 2 has a built-in barrier: if the type of program you need does not exist, the task of creating new programs has always been limited to those who can speak the computer’s language; you must learn at least one programming language and apply that knowledge to write programs. In other words, to get new programs, you have to know how to translate what you want to do into computerese, so the computer will understand what you want it to do. This bottleneck applies not only to situations where you want to create a program for yourself, but also when you want to create programs for others - in coding jobs, for instance.
Here are three of the major limitations of the current programming paradigm:
These factors often hinder or discourage the education and development of new programmers, especially among people in historically disadvantaged groups. In other words, traditional programming often presents people with a different kind of "coding problem" -- not one given on a test, but rather a formidable real-world obstacle that many simply cannot solve.
However, the good news is that there is another way.
What if you could just tell a machine the kind of program you need - just use your native language to describe your needs to a computer, and it would generate the code that does what you wanted? That’s the amazing promise of conversational AI programming: CodeGen makes programming as easy as talking.
Here’s an analogy to help illustrate the concept. When you order dinner in a restaurant, you don’t have to know the correct ingredients for your desired dish and then cook it yourself; you just tell the server what you want, and the restaurant prepares it and brings it to you. Name the dish you want in a short sentence, and you get it without taking any part in preparing the meal - no need to specify any ingredients or cooking steps, and you don’t need to know any special culinary terms. The restaurant acts like an intelligent system, translating your plain-English request (order) into a sequence of steps that takes basic food ingredients and generates the outcome (cooked dish) you asked for. Now imagine you’re “ordering” computer code instead of a meal, and you have the basic idea behind CodeGen.
Our implementation of conversational AI programming provides a glimpse into the future of democratizing software engineering for the masses. An “AI assistant” translates English descriptions into functional and executable Python code - allowing anyone to write code, even if they know nothing about programming. The underlying language model, CodeGen, enables this conversational paradigm and will be made available as open source to accelerate research.
This new paradigm in programming takes the form of a simple yet highly intelligent dialogue. In the concept’s full implementation (our vision of how it would work in its ultimate form), a typical fully-interactive conversation about your desired code would flow as follows:
While the above (fictional) conversation example helps illustrate the full conceptual vision, let’s turn to some real-world examples - the concrete realization of the concept as it exists today in CodeGen. Let's start by revisiting the two-sum problem we introduced at the start of this blog:
Note that this time, we don’t stop once CodeGen generates working code to solve the problem - we ask the model to try again, and solve the problem using a hash map. This example illustrates some of the groundbreaking capabilities of our system: we can continue our conversation, refer back to “the problem” (a backreference, which CodeGen understands), and request that the model try a new approach (the hash map), in the hopes of getting an even better solution. (In our restaurant analogy, this would be like giving additional instructions to the server about your order, like “use egg whites only” or “use margarine instead of butter.”)
And it works: CodeGen succeeds in generating new code that uses a hash map, and the new solution runs in linear time, O(n), which is much faster than the original O(n^2) solution.
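To make the complexity difference concrete, here is a minimal hand-written sketch of the two approaches contrasted above. It is not CodeGen's actual output; the function names and the choice to return index pairs are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def two_sum_brute_force(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Check every pair of positions: O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return i, j
    return None  # no pair adds up to target


def two_sum_hash_map(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Single pass with a dict (hash map): O(n) time, O(n) extra space."""
    seen = {}  # value -> index of an earlier occurrence of that value
    for j, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return seen[complement], j
        seen[value] = j
    return None  # no pair adds up to target


print(two_sum_brute_force([2, 7, 11, 15], 9))  # (0, 1)
print(two_sum_hash_map([2, 7, 11, 15], 9))     # (0, 1)
```

The hash-map version trades extra memory for speed: instead of rescanning the list for each element, it looks up the needed complement in the dictionary, which takes constant time on average.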
The above “hash map” example illustrates a key aspect of CodeGen: while anyone can use CodeGen to build software from scratch, even non-coders, it does help to have some programming knowledge in certain cases. For example, knowing coding concepts can help you think of follow-up commands to give CodeGen, suggesting new avenues to explore while building the code (like using hash maps, or recursion - or not using these techniques).
While the vision is to create optimal programs for any problem by just telling the machine what you want, without needing any coding knowledge, the reality is that some programming knowledge can often help, in order to guide CodeGen to a good solution. This is especially true for more complex problems, where having the user suggest different approaches may help the software find a working solution - or a more efficient one.
Still, even for experienced coders, CodeGen makes getting to a functioning solution faster and easier, and allows rapid exploration of alternate methods. In other words, CodeGen is beneficial for all levels of programmers.
Approach. Salesforce AI Research trained CodeGen, a 16-billion parameter auto-regressive language model, on a large corpus of natural and programming languages. Two aspects are of particular interest: (1) sampling executable code by scaling the size of the model and dataset, (2) emergence of conversational capabilities.
Scaling. The large size of this model is motivated by the empirical observation that scaling the number of model parameters in proportion to the number of training samples appears to strictly improve the performance of the model. This phenomenon is described by so-called scaling laws. We leverage it to learn a model that can translate natural language (English) into a programming language (code) with high accuracy. That is, the model is capable of generating code that is not only reasonable but also executable; the generated code is of such high quality that it can be immediately executed without revisions by a programmer, which allows even a non-professional audience to “write” code.
Conversation. Having a conversation appears to be a rather trivial task for humans. We implicitly keep track of the past conversation, resolve references to previously mentioned elements, and incrementally build a mental picture or story of the discourse. For machines, holding a realistic conversation is one of the grand challenges of our time. Testing whether a machine possesses human capabilities, or can fool a human into believing they are holding a conversation with another human being, is known as the Turing Test. While in the first iteration of our research the model replies in a formal language (i.e., the programming language) and not a natural language, later incarnations will take the form of a multi-turn discourse in natural language, so that the model may resolve ambiguities, as in “May I solve this problem with algorithm A, B, or C?”. Surprisingly, modeling such conversations in conjunction with the scaling laws turned out to be rather simple, and simplicity is a desirable property (see Rich Sutton’s “The Bitter Lesson”).
Specifically, a conversation consisting of several consecutive questions (asked by the human in natural language) and answers (given by the machine in a programming language) is concatenated into a single long sequence. Based on this context of the past conversation, an auto-regressive decoder model samples the next response conditional on the past pairs of questions and answers. The fact that conversational capabilities emerge with such a naively simple approach (given sufficient data and model size) was surprising.
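The concatenation step can be sketched in a few lines of Python. This is an illustrative sketch only, not the CodeGen implementation: the prompt format (each request written as a `#` comment above the generated code), the helper names `build_context` and `converse`, and the stand-in `fake_sample` function are all assumptions. A real system would replace `fake_sample` with a call that samples from the language model.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (human request, generated code)


def build_context(history: List[Turn], request: str) -> str:
    """Concatenate past (request, code) pairs and the new request into one sequence."""
    parts = []
    for past_request, past_code in history:
        # Assumed format: the request as a comment, followed by the code it produced.
        parts.append(f"# {past_request}\n{past_code}\n")
    parts.append(f"# {request}\n")
    return "\n".join(parts)


def converse(requests: List[str], sample: Callable[[str], str]) -> List[Turn]:
    """Run a multi-turn session; each response is conditioned on all earlier turns."""
    history: List[Turn] = []
    for request in requests:
        context = build_context(history, request)
        code = sample(context)  # the model only ever sees one long sequence
        history.append((request, code))
    return history


def fake_sample(context: str) -> str:
    """Hypothetical stand-in for sampling from an auto-regressive model."""
    return f"pass  # placeholder output for {len(context)} characters of context"


turns = converse(
    ["Solve the two sum problem", "Solve the problem using a hash map"],
    fake_sample,
)
print(build_context(turns, "Add a docstring"))
```

Because the second request is appended to a sequence that already contains the first request and its answer, a phrase like "the problem" can be resolved from the context alone, with no separate dialogue-state component.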
Picture the example shown earlier, in which first a problem is stated and subsequently the question (or specification) is refined:
“Solve the two sum problem”
“Solve the problem using a hash map”
While solving the first request can be understood as a form of pseudo-retrieval of examples in the observed training data (think of a database query), the second request involves resolving the backreference of “the problem” to “the two sum problem”, and requires a shallow form of understanding of the previously generated code in order to rewrite it using a hash map. This phenomenon is crucial, as the underlying model was never specifically trained to hold a conversation or revise code. These conversational and problem-solving capabilities “emerged” naturally.
Benefits for Next-Gen Software Development: Programs of the Future Need This
While programming is a useful skill today, in the next decade programming will be a necessity in many tech jobs, including at Salesforce. The world needs more and more code, in every aspect of society, and these programs are getting increasingly complex. Hence, systems like CodeGen (which help speed up the programming process while making it easier and more manageable) should play an integral role in completing increasingly large coding projects, as well as bringing a whole new generation of programmers into the world of coding to achieve these goals.
But there is another issue on the horizon: what happens when future programming needs become so complex that the skills needed to create these programs outstrip human capabilities? Digital ecosystems are evolving into systems with ever-increasing functional complexity, and at some point these systems’ complexity may increase beyond our capacity to understand them, let alone build them. We may soon get to the point where projects require technology such as conversational AI programming, in order to create the mega-complex software systems of the future -- both on the massive scale that will be required, and in a timeframe that would be impossible for a team of human programmers to produce on their own.
In short, rapidly increasing code complexity requires a new paradigm. Next-gen programming needs, both at Salesforce and at other organizations, seem destined to make conversational AI programming systems like CodeGen essential to our future.
Benefits for Society: CodeGen Democratizes Programming
A major part of Salesforce’s mission is to develop technology that can help all of society, not just the company, and that is exactly what this research does. Many groups will benefit from the conversational AI programming revolution that CodeGen represents. Here are some examples.
Enhancing equality and equity. Opening up coding to all - democratizing access to the world of creating programs - will help bring traditionally disadvantaged groups into the world of programming, leading to increased career opportunities and incomes for such groups.
Education/teaching/learning. Kids will learn to program interactively with “AI teachers” as their companions, creating worlds and games through a dialogue in their native language while learning how to translate their ideas into programming languages.
Software professionals: engineers, data scientists, developers. Software engineers will understand the architecture and design patterns of legacy systems, and summarize their critical paths, with the aid of “AI assistants”. Analysis of space and time complexity, security vulnerabilities, and design patterns, as well as refactoring and test generation, will be supported by an artificial pair-programmer.
Non-software professionals. Business analysts will integrate complex external data sources and systems, correlate and normalize data, perform exploratory analysis, and visualize findings in conjunction with “AI analysts”.
In general, the democratization of coding should reap society-wide rewards.
Salesforce AI Research invites you to dive deeper into the concepts discussed in this blog post (links below). Connect with us on social media and our website to get regular updates on this and other research projects.
Erik Nijkamp is a Research Scientist at Salesforce AI Research. His research emphasis is on large-scale generative models and representation learning with applications in NLP and computer vision. Prior to Salesforce, he was a PhD student under Prof. Song-Chun Zhu and Prof. Ying Nian Wu at UCLA.
Donald Rose is a Technical Writer at Salesforce AI Research. He works on writing and editing blog posts, video scripts, media/PR material, and other content, as well as helping researchers transform their work into publications geared towards a wider (less technical) audience.