Science Advances Publishes AI Economist Research on Improving Tax Policies With Reinforcement Learning

TL;DR: The AI Economist, a reinforcement learning (RL) system, learns dynamic tax policies that optimize equality along with productivity in simulated economies, outperforming alternative tax systems. We have now expanded this research, which is being published in the interdisciplinary scientific journal Science Advances.

Humans or AI: Which Can Design More Effective Tax Policy?

If you had a choice, which would you trust more to design an effective tax policy: humans (as in our government) or artificial intelligence? Before you answer, think of all the nuances, complexities, and mountains of data needed to design a truly optimal tax system – one that is fair and promotes equality while also maximizing productivity. Now which would you choose?

It turns out that a reinforcement learning (RL) system called the AI Economist, which we developed, has already answered this question. And the answer is: in simulated economies, AI does it better when it comes to tax policy design. Read on to find out why.

Note that, in this context, AI is meant to augment human decision-making, not supplant it. This is especially important for public policy. A system like the AI Economist would ideally be used by government officials to help them design optimal tax policies that benefit the most people. Also note that the AI system doesn't set the goal. It's given a goal by humans, and can then recommend the best policy for that goal.

In short, we believe that the best strategy for creating tax policy would be humans and AI working together, each applying their unique strengths to maximize the public good.

The Problem: Designing Optimal Tax Policies is Hard

Economic inequality is accelerating globally and has negative impacts on economic opportunity, health, and social welfare. Taxes are important government tools that, when designed and applied in the right way, can work to reduce inequality.

However, designing taxes that optimize social welfare (for example, equality along with productivity) is difficult in the face of the complexity of today’s economy. Case in point: taxpayers may change their behavior in response to a change in taxes – they may work more, or work less, or may move to another state – and such behavioral changes are difficult to predict at an individual level.

Current economic methodologies are often limited in addressing these issues. For instance, there is not enough empirical data to support fine-grained models of human behavior, and economic models often make simplifying assumptions that are unrealistic, such as the assumption that agents do not trade or interact in some way. This is where AI can help.

Our Solution: The AI Economist

To address the problem of designing optimal taxes that balance equality with productivity, we developed the AI Economist. Our system brings the powerful techniques of reinforcement learning to bear on tax policy design for the first time to provide a purely simulation- and data-driven solution.

In our initial release of the AI Economist, we showed that AI can improve the tradeoff between equality and productivity by 16% in a simulated economy, a significant increase. This AI solution outperforms a prominent tax framework proposed by Emmanuel Saez, with even larger gains over an adaptation of the US Federal income tax and the free market. (For more information on that work, please see our previous blog post.)

We have now expanded this research, and are proud to announce that our expanded work is being published in Science Advances, a peer-reviewed interdisciplinary scientific journal published under the Science brand. (Science, also known as Science Magazine, is the prestigious academic journal of the American Association for the Advancement of Science.)

Our extensive analysis and experimental results, obtained from running the AI policy in the simulator, give a solid indication that AI-designed tax policies are more effective than policies designed with traditional economic methods.

Deep Dive: How Our Approach Works

The AI Economist uses two-level deep reinforcement learning to design tax policies. This framework uses a simulation with economic agents and a government whose behaviors are modeled using deep RL. Using continuous experimentation and adaptation, RL then finds policies that achieve a higher degree of social welfare compared to baselines such as progressive, regressive, or no taxes.

As a key contribution, we describe a practical approach using learning curricula to stabilize the two-level RL process, which is intrinsically unstable.
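To make the two-level structure concrete, here is a minimal toy sketch in Python. It is not the AI Economist's code: plain grid search stands in for deep RL at both levels, there is no learning curriculum, and the skill values, the quadratic effort cost, the flat tax with an equal rebate, and the Gini-based equality measure are all assumptions chosen for illustration. The inner level has each agent pick the labor that maximizes its own utility under a given tax rate; the outer level has the planner pick the tax rate that maximizes equality times productivity, given how the agents respond.

```python
# Toy two-level optimization. NOT the AI Economist's implementation:
# grid search replaces deep RL, and every constant below is hypothetical.

SKILLS = [1.0, 2.0, 4.0, 8.0]                   # income earned per hour, per agent
LABOR_CHOICES = [h / 2 for h in range(0, 21)]   # 0.0, 0.5, ..., 10.0 hours
TAX_RATES = [r / 20 for r in range(0, 20)]      # 0.00, 0.05, ..., 0.95


def best_labor(skill, tax_rate):
    """Inner level: one agent picks the hours that maximize its own utility.

    Simplification: the agent ignores the rebate when choosing its hours.
    """
    def utility(hours):
        # Post-tax income minus a quadratic effort cost (assumed form).
        return (1 - tax_rate) * skill * hours - 0.5 * hours ** 2
    return max(LABOR_CHOICES, key=utility)


def gini(values):
    """Gini index: 0 means everyone has the same amount."""
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def simulate(tax_rate):
    """Let every agent respond to the tax, then measure the outcome."""
    incomes = [s * best_labor(s, tax_rate) for s in SKILLS]
    n = len(SKILLS)
    rebate = tax_rate * sum(incomes) / n        # revenue is shared equally
    post_tax = [(1 - tax_rate) * y + rebate for y in incomes]
    productivity = sum(incomes)
    equality = 1 - gini(post_tax) * n / (n - 1)  # assumed normalization
    return productivity, equality


def plan():
    """Outer level: the planner picks the tax rate with the best outcome."""
    def objective(rate):
        productivity, equality = simulate(rate)
        return productivity * equality
    return max(TAX_RATES, key=objective)


if __name__ == "__main__":
    rate = plan()
    print("chosen flat tax rate:", rate, "->", simulate(rate))
```

Even in this toy, the planner cannot evaluate a tax rate without first letting the agents adapt to it, which is why the two levels are coupled. In the real system both levels learn at the same time, which is the source of the instability that the learning curricula are meant to address.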

Our AI-driven solution is more flexible than economic theory, which often struggles with the complexities of the real world. The AI Economist is effective in dynamic economies that change over time, can directly optimize for any socio-economic objective, and learns from observable data alone.

In contrast, economic theory often relies on simplifying assumptions that are hard to validate – for example, about the effect of taxes on how much people work or what the government knows about taxpayer preferences.

Experimental Findings

In our publication, we show through extensive experiments that our two-level RL solution is trustworthy and more effective than traditional economic methods. Here are some highlights:

Interactions matter. Our work shows that interactions strongly affect what tax policy is best. For example, a resource buyer may buy less if taxes are high, and this can affect how much profit the sellers can make. Such second-order effects are hard to model and take into account with traditional economic tools. Instead, traditional economics often assumes that people are independent agents who don't talk or cooperate with each other. AI allows us to optimize policies based on a more detailed representation of the world.

Robust results. We tested the AI solution in more varied simulations, with different layouts and different distributions of starting locations and resources. Overall, our AI solution did better across the board compared to several baselines. Along several key metrics, including utility and a combined measure of equality and productivity, the AI Economist's results beat alternative tax systems (the Free Market, an adaptation of the US Federal income tax, and the Saez Formula), as the figure below shows.

AI matches theory. In a simpler economy with a single time step, the AI Economist can rediscover optimal policies that were previously found using mathematical analysis. This builds trust that our two-level RL strategy can find solutions that economists already know.

Flexibility. We show that two-level RL can optimize policies for two different social welfare objectives – specifically (1) maximizing equality and productivity, and (2) maximizing the sum of all individual agent utilities. This shows that our AI approach is flexible and not constrained to analytically tractable objectives.

More agents. Our simulation used 10 agents, up from 4 in our earlier work. This suggests that our approach can be effective in larger economies. Future work and engineering efforts may make it possible to test AI solutions in such larger simulations.
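As an illustration of the two social welfare objectives mentioned under Flexibility above, the following Python sketch shows one way each could be computed from the outcome of a simulation. The exact definitions are assumptions, not taken from this post: here equality is one minus a normalized Gini index of incomes, productivity is total income, and the function names are hypothetical.

```python
def gini(incomes):
    """Gini index of a list of non-negative incomes (0 = perfectly equal)."""
    n = len(incomes)
    total = sum(incomes)
    if total == 0:
        return 0.0
    pairwise = sum(abs(a - b) for a in incomes for b in incomes)
    return pairwise / (2 * n * total)


def equality_times_productivity(incomes):
    """Objective (1): trade off equality against productivity.

    Assumed definitions: equality = 1 - gini * n / (n - 1), so it is 1 when
    incomes are identical and 0 when one agent holds everything;
    productivity = total income. Requires at least two agents.
    """
    n = len(incomes)
    equality = 1.0 - gini(incomes) * n / (n - 1)
    productivity = sum(incomes)
    return equality * productivity


def total_utility(utilities):
    """Objective (2): the sum of all individual agent utilities."""
    return sum(utilities)


if __name__ == "__main__":
    # Two hypothetical outcomes: a smaller but evenly shared economy,
    # and a larger economy where one agent earns everything.
    print(equality_times_productivity([10, 10, 10, 10]))
    print(equality_times_productivity([0, 0, 0, 100]))
```

Because the planner only needs a number to maximize, swapping one objective for the other means swapping one function, which is the sense in which the approach is not tied to analytically tractable objectives.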

Impact on Society, the Economy, and Government

Our work shows that AI (and, more specifically, economic simulations with AI agents) is a powerful platform for economic analysis and design. Future policymaking could be informed by AI-driven simulations and explainable AI policies. For example, AI systems could explain the tradeoffs of using different policies when taking the detailed mechanics of the economy into account, such as the effects of inflation.

The overall impact: AI can help government officials make better-informed policy decisions in the face of ever-growing economic complexity, and also help them explain the reasoning behind those policy decisions to other officials and the general public.

The Bottom Line

Our vision for the AI Economist is to enable an objective study of policy impact on real-world economies, at a level of complexity that traditional economic research cannot easily address. We believe the intersection of machine learning and economics presents a wide range of interesting research directions, and gives ample opportunity for machine learning to have a positive social impact.

For all its success thus far, this work still represents an initial step. Economics has yet to adopt AI techniques more broadly. However, other areas of AI went through similar adoption timelines, with initial hesitancy (or lack of awareness) eventually changing to more widespread acceptance and applications. As we continue to expand our research efforts, we hope our results inspire the community to study and deploy AI techniques in more areas of economics in the future.

Explore More

Salesforce AI Research invites you to dive deeper into the concepts discussed in this blog post (links below). Connect with us on social media and our website to get regular updates on this and other research projects.

About the Authors

Stephan Zheng leads the AI Economist team at Salesforce Research, which works on deep reinforcement learning and AI simulations to design economic policy. His work has appeared in Science Advances and has been widely covered in the media, including the Financial Times, Axios, Forbes, Zeit, Volkskrant, MIT Tech Review, and others. He holds a Ph.D. in Physics from Caltech (2018) and interned with Google Research and Google Brain. Before machine learning, he studied mathematics and theoretical physics at the University of Cambridge, Harvard University, and Utrecht University. He received the Dutch Lorentz Graduation Prize for his thesis on topological string theory and was twice awarded the Dutch Huygens Scholarship.

Donald Rose is a Technical Writer at Salesforce AI Research. Specializing in content creation and editing, Dr. Rose works on multiple projects, including blog posts, video scripts, news articles, media/PR material, social media, writing workshops, and more. He also helps researchers transform their work into publications geared towards a wider audience.


This work was a joint effort with contributions from Stephan Zheng, Alex Trott, Sunil Srinivasa, David Parkes, and Richard Socher. We also wish to thank Nikhil Naik, Melvin Gruesbeck, and Kathy Baxter.

Additional thanks to Lofred Madzou, Simon Chesterman, Rob Reich, Mia de Kuijper, Scott Kominers, Gabriel Kriendler, Stefanie Stantcheva, and Thomas Piketty for invaluable discussions.