Ran Xu - Salesforce AI

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

We are excited to open-source 🍃MINT-1T, the first trillion-token multimodal interleaved dataset and a valuable resource for the community to study and build large multimodal models.

24 Jul 2024

HIVE: Harnessing Human Feedback for Instructional Visual Editing

HIVE was accepted to CVPR 2024. Other authors include: Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong. We have seen the success of ChatGPT, which incorporates human feedback to align text generated by large language models to human preferences. Is it possible to align

17 Jun 2024

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

07 Dec 2023

BannerGen: A Library for Multi-Modality Banner Generation

Background: Graphic layout designs serve as the foundation of communication between media designers and their target audience. They play a pivotal role in organizing various visual elements, including rendered text, logos, product images, calls to action (such as buttons), and background textures/images. The arrangement of these elements is the

06 Dec 2023

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

Other authors include: Can Qin, Stefano Ermon, Yun Fu. GlueGen was accepted by ICCV. Text-to-image synthesis has made remarkable progress in generating lifelike images from textual prompts. However, a significant challenge remains: how can we seamlessly integrate powerful pre-trained text encoders into

29 Sep 2023

A Leap Forward in 3D Understanding: The ULIP and ULIP-2

TL;DR: Imagine a world where machines comprehend 3D objects just as humans do. The ULIP (CVPR 2023) and ULIP-2 projects, backed by Salesforce AI, are making this a reality by revolutionizing 3D understanding. ULIP uniquely pre-trains models with 3D point clouds, images, and texts, aligning them into a unified representation

23 May 2023

Burn After Reading: Preserving Privacy Using Online Adaptation for Cross-Domain Streaming Data

AUTHORS: Zeyuan Chen, Ran Xu, Luyu Yang, Donald Rose. TL;DR: Many methods designed to preserve online privacy propose complex security measures to protect sensitive data. We believe that not storing any sensitive data is the optimal way to preserve privacy, so we propose a “burn after reading” online framework:

06 Oct 2022 • #online privacy