AUTHORS: Sharvin Shah, Jin Qu, Donald Rose
TL;DR: TaiChi is an open-source library for few-shot NLP, designed for data scientists and software engineers who want quick results or a proof-of-concept product but don’t have much experience with few-shot learning (FSL). The library abstracts complex FSL methods into Python objects that can be accessed through one or two lines of code, greatly lowering the barrier to learning and using the latest FSL methods.
Tai Chi, a well-known Chinese martial art, emphasizes "smart strength": using leverage and technique, rather than brute force, to achieve great power with minimal effort.
Interestingly, this philosophy maps neatly onto few-shot learning (FSL) research: using "smart tricks," researchers strive to train models that perform well with only a small amount of data.
Over the last few years, we have seen great progress in FSL research, thanks to work that has been done in pre-training, meta-learning, data augmentation, and public benchmark datasets. Since data collection and labeling are often expensive and time-consuming, breakthroughs in FSL research have huge potential use cases in the industry.
However, while FSL is an active research area with great potential for many applications, off-the-shelf, user-friendly libraries that let data scientists and software engineers explore it quickly have not been readily available.
In the spirit of the martial art Tai Chi and its use of intelligent methods to achieve good performance with less effort, we developed an FSL library and named it TaiChi in the hopes that it will help others’ model training in low-data scenarios.
Here is our system in a nutshell:
To provide a better understanding of our approach, let’s take a closer look at how TaiChi works.
Our current release, TaiChi 1.0, contains two main FSL methods: DNNC (Discriminative Nearest-Neighbor Classification) and USLP (Utterance Semantic Label Pair). Both are designed primarily for few-shot intent classification.
Why does TaiChi 1.0 use these two methods? Here’s a quick refresher:
The figure below compares standard intent classification with DNNC and USLP. Both methods cast intent classification as NLI-style (natural language inference) entailment prediction, but they differ in what gets compared: DNNC predicts entailment between the query and each utterance in the training set, while USLP simplifies this by predicting entailment between the utterance and the semantic labels themselves.
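To make the difference concrete, here is a minimal sketch (not the actual TaiChi API) of how the two methods use an entailment scorer differently. The scorer below is a toy word-overlap stand-in; a real system would call an NLI model such as the tuned backbones described later in this post.

```python
# Illustrative sketch, NOT the TaiChi API: how DNNC and USLP differ in what
# they feed to an NLI-style entailment scorer.

def entailment_score(premise: str, hypothesis: str) -> float:
    # Toy stand-in for an NLI model: word-overlap ratio.
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

# Hypothetical few-shot training set: (utterance, intent) pairs.
train = [
    ("what is my checking balance", "balance"),
    ("how much money do i have", "balance"),
    ("move funds to savings", "transfer"),
    ("transfer money to my savings account", "transfer"),
]
labels = sorted({intent for _, intent in train})

def predict_dnnc(query: str) -> str:
    # DNNC: score entailment between the query and EVERY training utterance,
    # then return the intent of the best-matching (nearest-neighbor) utterance.
    best = max(train, key=lambda ex: entailment_score(query, ex[0]))
    return best[1]

def predict_uslp(query: str) -> str:
    # USLP: score entailment between the query and each semantic label directly,
    # needing only len(labels) comparisons instead of len(train).
    return max(labels, key=lambda lab: entailment_score(query, lab))
```

Note the cost difference this illustrates: DNNC's inference scales with the number of training utterances, while USLP's scales with the number of labels.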
Results and findings of using these two methods:
We are also releasing the backbone models for DNNC and USLP. They are based on public pre-trained models from Huggingface and further tuned on NLI data to adapt them to NLI-style classification.
Please refer to the NLI pre-training pipeline here (Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference) if you would like to pre-train a new model.
We use the CLINC150 Dataset for benchmarks and tutorials.
The original data_small.json is sub-sampled and further processed.
Users can download the processed dataset from here.
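For readers who want to build their own few-shot splits, here is a small sketch of per-intent sub-sampling. It assumes each split is a list of [utterance, intent] pairs (the format we assume for a CLINC150-style file such as data_small.json); it is not the exact processing script used for the released dataset.

```python
import random
from collections import defaultdict

def subsample_k_shot(split, k, seed=0):
    """Pick up to k utterances per intent from a split given as a list of
    [utterance, intent] pairs. Assumed CLINC150-style format; sketch only."""
    by_intent = defaultdict(list)
    for utterance, intent in split:
        by_intent[intent].append(utterance)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    few_shot = []
    for intent, utts in by_intent.items():
        for utt in rng.sample(utts, min(k, len(utts))):
            few_shot.append([utt, intent])
    return few_shot

# Demo with a synthetic split in the assumed format:
split = [[f"utterance {i} for {intent}", intent]
         for intent in ("balance", "transfer") for i in range(10)]
shots = subsample_k_shot(split, k=5)
```

Seeding the sampler makes the few-shot split reproducible, which matters when comparing FSL methods on the same budget of labeled examples.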
The TaiChi library serves as an API hub for effective FSL methods proposed by the Salesforce Research team, which has carried out several FSL-related projects for both research and applications.
The software's main contribution is a user-friendly API that lets engineers without FSL experience start experimenting quickly.
In the spirit of helping the wider community, TaiChi is open source.
Salesforce AI Research invites you to dive deeper into the concepts discussed in this blog post (links below). Connect with us on social media and our mailing list to get regular updates on this and other research projects.
Sharvin Shah is a Research Engineer at Salesforce AI Research. He is interested in topics related to conversational AI, such as data augmentation for intent recognition and conversational language modeling.
Jin Qu is a Research Engineer at Salesforce AI Research. His work focuses on NLP products R&D, which includes few-shot learning, multi-/cross-lingual, and knowledge distillation.
Donald Rose is a Technical Writer at Salesforce AI Research. Specializing in content creation and editing, Dr. Rose works on multiple projects, including blog posts, video scripts, news articles, media/PR material, social media, writing workshops, and more. He also helps researchers transform their work into publications geared towards a wider audience.
A review of some key terms used in our discussion: