Blog

The Latest and Greatest from Salesforce Research

How Salesforce Builds Reproducible Red Teaming Infrastructure

Introduction Imagine you’re working on an AI product that can summarize customer success phone calls for training purposes. Your company’s product leverages large language models (LLMs) to summarize, synthesize, triage, and generate relevant outputs. You’re aware that LLMs can hallucinate, output harmful or biased text, or be

09 Oct 2024 • Daniel Nissani

Accelerating Your Model Evaluation and Fine-tuning with SFR-Judge

As the development and deployment of large language models (LLMs) accelerates, evaluating model outputs has become increasingly important. The established method of evaluating responses typically involves recruiting and training human evaluators, having them evaluate the model responses, and then auditing the quality of the evaluations. Unfortunately, this process does not

26 Sep 2024 • Austin Xu

Building Contextually Faithful RAG Applications with SFR-RAG

Retrieval Augmented Generation (RAG) has not only gained steam as one of the most invested areas of research in generative AI but also gathered considerable popularity and commercialization opportunities. RAG is typically applied to question-answering problems, where certain external contextual information retrieved from a data source (potentially private) is provided

17 Sep 2024 • Xuan Phi Nguyen

xLAM: A Family of Large Action Models for AI Agents

Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Caiming Xiong TL;DR: We release xLAM, a series of LLMs optimized for function calling and AI Agents. It offers several variants designed to serve different application domains, from mobile usage to high-demand performance contexts. They show competitive performance across various key agent

06 Sep 2024 • Huan Wang

Introducing LlamaRank: A state-of-the-art reranker for trusted AI

As part of our commitment to innovation in enterprise RAG and trusted AI, we're excited to release SFR LlamaRank, a state-of-the-art reranker from Salesforce AI Research. LlamaRank is a language model specialized for document relevancy ranking. LlamaRank achieves performance at least comparable to leading APIs across general document

26 Aug 2024 • Antonio Ginart

We're Hiring! Trusted AI Roles at Salesforce

Meet the Office of Ethical and Humane Use Salesforce's Office of Ethical and Humane Use provides navigational guidance for the tough questions that arise when human potential meets emergent technology. We work across the company to guide the design, development, and deployment of trusted products, with a strong

11 Jul 2024 • Christina Zhang #AI ethics

INDICT: Towards Better Code Generation by Both Security and Helpfulness

TL;DR: We introduce INDICT, a novel framework that empowers Large Language Models (LLMs) with Internal Dialogues of Critiques for both safety and helpfulness guidance. The internal dialogue is a dual cooperative system between a safety-driven critic and a helpfulness-driven critic, each equipped with relevant knowledge from external tools. LLMs

04 Jul 2024 • Henry Hung Le

HIVE: Harnessing Human Feedback for Instructional Visual Editing

HIVE is accepted to CVPR 2024. Other authors include: Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong We have seen the success of ChatGPT, which incorporates human feedback to align text generated by large language models to human preferences. Is it possible to align

17 Jun 2024 • Shu Zhang

Moirai: A Time Series Foundation Model for Universal Forecasting

TL;DR: Moirai is a cutting-edge time series foundation model, offering universal forecasting capabilities. It stands out as a versatile time series forecasting model capable of addressing diverse forecasting tasks across multiple domains, frequencies, and variables in a zero-shot manner. To achieve this, Moirai tackles four major challenges: (i) construction

19 Mar 2024 • Gerald Woo #foundation model

Trusted NLG Research @ Salesforce AI

While we’ve seen amazing improvements in model performance over the last several years, we must be aware of the remaining downsides of these models. We believe that jointly improving these models and evolving our approaches to evaluating them is essential going forward.

28 Feb 2024 • Alex Fabbri #natural language generation

Aligning Diffusion Models to Human Preferences

TLDR Learning from human preferences, specifically Reinforcement Learning from Human Feedback (RLHF), has been a key recent component in the development of large language models such as ChatGPT or Llama2. Up until recently, the impact of human feedback training on text-to-image models was much more limited. In this work, Diffusion-DPO,

08 Jan 2024 • Bram Wallace #reinforcement-learning

The Ever-Growing Power of Small Models

Recent AI media coverage has followed a familiar pattern: a massive new model is released, making the rounds with beta testers and eventually the public, but it’s barely a month or two before rumors start to swell about the even bigger one supposedly being trained to replace it. Yet

21 Dec 2023 • Silvio Savarese

Salesforce Research at NeurIPS 2023

Conference Overview Next week, the Thirty-seventh annual Conference on Neural Information Processing Systems (NeurIPS) will be held in New Orleans, Louisiana from Sunday, December 10th, through Saturday, December 16th. NeurIPS will include invited talks, demonstrations, and oral and poster presentations of accepted papers. NeurIPS 2023 will be held again at the

07 Dec 2023 • Mia Ferrer

BannerGen: A Library for Multi-Modality Banner Generation

Background Graphic layout designs serve as the foundation of communication between media designers and their target audience. They play a pivotal role in organizing various visual elements, including rendered text, logos, product images, calls to action (such as buttons), and background textures/images. The arrangement of these elements is the

06 Dec 2023 • Chia-Chih Chen

From Copilot to CoOrchestration

Einstein Copilot has arrived! Find out more about the conversational AI for CRM here. Introduction I’ve written a lot in recent months about what I call Large Action Models, or LAMs—a more active, autonomous variation on LLMs that don’t merely generate content like text or images but

20 Oct 2023 • Silvio Savarese

CodeChain: Towards Modular Code Generation through Chain of Self-revisions and Representative Sub-modules

TL;DR: With CodeChain, a pretrained large language model (LLM) can solve challenging coding problems by integrating modularity in generation samples and self-improve by employing a chain of self-revisions on representative sub-modules. CodeChain can achieve state-of-the-art results with both OpenAI GPT models and open-source LLMs on challenging coding benchmarks like

20 Oct 2023 • Henry Hung Le

Using language models to design antibodies to combat autoimmune disorders

TL;DR: We adapted our protein language model ProGen to optimize antibodies that bind to a protein called “CD40L”, a critical target for autoimmune disorders. We tested our AI designed antibodies in the laboratory and found that they bound very tightly to CD40L, showcasing the potential of this approach for

13 Oct 2023 • Ben Krause

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

Other authors include: Can Qin, Stefano Ermon, Yun Fu GlueGen was accepted by ICCV. In the rapidly advancing field of text-to-image synthesis, the remarkable progress in generating lifelike images from textual prompts has been evident. However, a significant challenge remains: how can we seamlessly integrate powerful pre-trained text encoders into

29 Sep 2023 • Ning Yu

Open Source and the Future of Enterprise AI

Introduction Open source has become one of the hottest topics in AI, and the fanfare is well-deserved. The open source community is keeping a nimble pace with the state of the art, delivering ever-growing and ever-more-capable

25 Sep 2023 • Silvio Savarese

Prototyping XGen-Image-1

TLDR Generative AI methods for image generation have a wide variety of potential applications in marketing, sales, and e-commerce. With these applications in mind, the Salesforce Research team has developed several techniques based on image-generative diffusion models, including methods for image editing, improved classifier guidance, and improved controlled generation methods.

03 Aug 2023 • Bram Wallace

PyRCA: Making Root Cause Analysis Easy in AIOps

TL;DR: PyRCA is an open-source machine learning library specifically designed for conducting Root Cause Analysis (RCA) in IT operations. It offers a comprehensive framework that allows users to easily identify the complicated metric causal dependencies and automatically locate the root causes of incidents. The library provides a unified interface

11 Jul 2023 • Chenghao Liu #root cause analysis

CodeGen2.5: Small, but mighty

Equal contribution between Erik Nijkamp and Hiroaki Hayashi. Abstract The family of Salesforce CodeGen models is growing with CodeGen2.5 – a small, but mighty model! While there has been a recent trend of large language models (LLMs) of increasing size, we show that a small model can

06 Jul 2023 • Erik Nijkamp #CodeGen

Toward Actionable Generative AI

LAMs: From Large Language Models to Large Action Models There’s no question that we’re living in the era of generative AI, and its impact is only growing. More and more, AI is helping us write emails, create imagery, consume information, and even code. But as empowering as it

27 Jun 2023 • Silvio Savarese

A Leap Forward in 3D Understanding: The ULIP and ULIP-2

TL;DR: Imagine a world where machines comprehend 3D objects just as humans do. The ULIP (CVPR 2023) and ULIP-2 projects, backed by Salesforce AI, are making this a reality by revolutionizing 3D understanding. ULIP uniquely pre-trains models with 3D point clouds, images, and texts, aligning them into a unified representation

23 May 2023 • Le Xue

CodeT5+: Open Code Large Language Models

TL;DR: CodeT5+ is a new family of open code large language models (LLMs) with improved model architectures and training techniques. CodeT5+ achieves state-of-the-art performance among open-source LLMs on many challenging code intelligence tasks, including zero-shot evaluation on the code generation benchmark HumanEval. Background: Code LLMs Large language

20 May 2023 • Yue Wang #codet5+

LogAI: A Library for Log Analytics and Intelligence

TL;DR LogAI is an open-source library designed for log analytics and intelligence. It can process raw logs generated by computer systems and support log analytics tasks such as log clustering and summarization, as well as log intelligence tasks such as log anomaly detection and root-cause analysis. LogAI is compatible

06 Apr 2023 • Doyen Sahoo

In Loving Memory of Dragomir Radev

The Salesforce AI Team is mourning the loss of our beloved friend and mentor, Dragomir Radev. Our team was first introduced to Drago in November 2018 when he gave a talk at our Research Speaker Series. His passion for research beamed through his talk and our leadership team unanimously decided

04 Apr 2023 • Audrey Cook