We are proud and excited to announce this year’s winners of the Salesforce Research Deep Learning Grant. Each of our five winners will receive a $50K grant to advance their work using deep learning methods to help us shape the future of AI.
We were incredibly impressed by the quality and variety of the proposals submitted. We received twice as many applications as last year, from countries across the globe. After committee reviews and discussion among our panel of experts, we’ve selected the five winners below based on the quality of their proposals, the novelty of their ideas, and their relevance to the research topics we proposed.
We look forward to building lasting relationships with our grant winners, and to working together to advance the state of the art in AI.
Chenhao Tan, University of Colorado Boulder
AI for Good, De-biasing Deep Learning Models, Explainable AI (XAI), Natural Language Processing
Active Learning with Explanations for De-biasing NLP Models
When a model makes a mistake, it ought to be possible to understand the reasoning underlying that mistake and make corrections at the level of that reasoning. Frequently discussed issues like implicit model bias and vulnerability to adversarial attacks can be described as issues of attribution: models assign spurious associations to input tokens. Explainable AI offers methods for describing these lapses in reasoning, but there exists no method for correcting them efficiently. To address this gap, we propose to develop an active learning approach that selectively solicits corrections to model reasoning at the token level.
Our approach combines insights from explainable AI and active learning to 1) efficiently sample useful examples, 2) explain model reasoning to users, 3) solicit alternative explanations via token-level labels, and 4) incorporate those corrections back into the model as additional training signal. The proposed work will make it possible to refine NLP models more directly and efficiently than traditional methods allow.
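Two pieces of this loop, picking examples to show to annotators and turning their token-level feedback into a training signal, can be illustrated with a small Python sketch. It is not the authors' implementation. Entropy-based uncertainty sampling stands in for whatever sampling strategy the project adopts, and the function names (`select_most_uncertain`, `attribution_penalty`) and the 0/1 token mask format are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k unlabeled examples with the highest entropy.

    pool_probs: one predicted class distribution per example in the pool.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def attribution_penalty(attributions, human_mask):
    """Fraction of attribution mass on tokens the annotator marked irrelevant.

    attributions: per-token importance scores from some explanation method.
    human_mask:   1 if the annotator says the token matters, 0 otherwise
                  (a hypothetical label format).
    """
    total = sum(abs(a) for a in attributions)
    if total == 0:
        return 0.0
    off_target = sum(abs(a) for a, m in zip(attributions, human_mask) if m == 0)
    return off_target / total


# Three pool examples with binary predictions; ask annotators about two.
pool_probs = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
to_annotate = select_most_uncertain(pool_probs, k=2)

# The annotator marks the first token as spurious for one selected example.
penalty = attribution_penalty([0.6, 0.1, 0.3], [0, 1, 1])
```

In a training loop, a penalty of this kind could be added to the task loss so that the model is discouraged from relying on tokens annotators flag as spurious.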
Christopher Ré, Stanford University
De-biasing Deep Learning Models, Lifelong Learning, Data Augmentation and Robustness
Muscaria: Model Patching with Weakly Supervised Data Augmentation
Data augmentation is a common technique for dramatically improving the generalization of machine learning models and has been of paramount importance to the success of computer vision. Augmentation strategies in use today have been developed through trial and error over many years of practice, and they perform generic rather than semantic manipulations of the data. The goal of our research is to enable users to perform model patching: improving deployed models using customized, complex, and semantic data augmentation pipelines spun up with weak user supervision. This work will build on the expertise our lab has developed over the last few years in theory, algorithms, and systems for incorporating weak supervision into machine learning models.
Greg Durrett, University of Texas at Austin
Natural Language Processing
Exerting Fine-Grained Content Control for Summarization
While recent neural summarization methods have achieved strong performance on standard datasets, saturating ROUGE scores in settings like CNN/Daily Mail, summarization as a whole is still far from solved. A summarization system should be usable by a range of users with different information needs across a variety of text domains, which is not the case for current systems. We propose a summarization model where users can exert fine-grained control over both content selection and generation, manipulating the system's behavior on a particular instance and, more importantly, being able to declaratively specify how to produce a summary. Our model decouples the content selection and generation processes, selecting content at a sub-sentence level with what we call an information skeleton, then generating the summary abstractively. The user can express preferences via key phrases about what information is selected in the skeleton. We propose several targeted evaluations to establish our model's overall effectiveness and controllability in our settings of interest.
Hung-yi Lee, National Taiwan University
Lifelong Learning, Multi-Task Learning, Natural Language Processing, DecaNLP
Lifelong Language Learning
The goal of this proposal is to investigate and build a thorough understanding of lifelong language learning (LLL). This project includes: (1) Extending LAMAL, our previous work on LLL, to the entire DecaNLP dataset. (2) Quantifying how semi-multitask guides LLL. (3) Obtaining a theoretical view of LLL in terms of neural network robustness and task relatedness. Eventually, we hope to realize real-world applications based on our research.
Pulkit Agrawal, Massachusetts Institute of Technology
Few-Shot Learning for Visual Recognition, Lifelong Learning, Multi-Task Learning
A Unified Framework for Continual and Few-Shot Learning
Current research has treated lifelong learning and transfer for few-shot recognition as separate problems. In lifelong learning, significant research effort is concerned with overcoming catastrophic forgetting by defining different ways to constrain weight updates. In transfer learning, meta-learning based methods have gained dominance in the past few years. Instead of treating these as separate problems, we propose a unified framework, built on the idea of weight superposition, that overcomes catastrophic forgetting and allows an agent to learn in a lifelong fashion by composing models learned for previous tasks. Because solutions to new tasks are determined by composing previously learned models, we expect our system to outperform the current dominant paradigm of updating weights by fine-tuning in few-shot scenarios.
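The abstract does not say how weight superposition is realized, so the following Python sketch shows only one simple reading of the idea, under stated assumptions: each task's weight vector is multiplied elementwise by a random ±1 key and the results are summed into a single shared vector, from which a task's weights can be approximately recovered by multiplying with its key again. The helper names (`random_key`, `superpose`, `retrieve`) and the toy dimensions are hypothetical, and the grant project's construction may differ.

```python
import math
import random


def random_key(dim, rng):
    """Random +1/-1 key; multiplying by it twice is the identity."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def superpose(weights, keys):
    """Sum the key-bound weight vectors of all tasks into one memory vector."""
    dim = len(weights[0])
    memory = [0.0] * dim
    for w, key in zip(weights, keys):
        for i in range(dim):
            memory[i] += w[i] * key[i]
    return memory


def retrieve(memory, key):
    """Unbind one task: its weights plus noise from the other tasks."""
    return [m * c for m, c in zip(memory, key)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
dim, num_tasks = 2000, 3  # toy sizes chosen for illustration

# Stand-ins for weight vectors learned on three separate tasks.
task_weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tasks)]
task_keys = [random_key(dim, rng) for _ in range(num_tasks)]

memory = superpose(task_weights, task_keys)
recovered = retrieve(memory, task_keys[0])

match = cosine(recovered, task_weights[0])     # clearly positive
mismatch = cosine(recovered, task_weights[1])  # close to zero
```

The recovered vector is a noisy copy of the first task's weights and is nearly uncorrelated with the others, which is why one shared parameter vector can hold several task-specific models without new tasks overwriting old ones.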