ALPRO: Understanding Video and Language by Aligning Visual Regions and Text Entities

Lead Author: Dongxu Li. TL;DR: We propose ALPRO, a new video-and-language representation learning framework that achieves state-of-the-art performance on video-text retrieval and video question answering by learning fine-grained alignment between video regions and textual entities via entity prompts. For more background (a review of key concepts used in this

31 May 2022 • #ALPRO

CoMatch: Advancing Semi-supervised Learning with Contrastive Graph Regularization

TL;DR: We propose a new semi-supervised learning method that achieves state-of-the-art performance by learning jointly-evolved class probabilities and image representations. What are the existing semi-supervised learning methods? Semi-supervised learning aims to leverage a small amount of labeled data and a large amount of unlabeled data. As a long-standing and widely-studied topic in

23 Nov 2020

MoPro: Webly Supervised Learning with Momentum Prototypes

TL;DR: We propose a new webly-supervised learning method that achieves state-of-the-art representation learning performance by training on large amounts of freely available noisy web images. Deep neural networks are known to be hungry for labeled data. Current state-of-the-art CNNs are trained with supervised learning on datasets such as ImageNet

17 Sep 2020 • #webly supervised learning