Mon Jan 30 2023

Extracting Training Data from Diffusion Models

generative models
computer vision
privacy and security
image generation and recognition
privacy and security
data extraction and analysis

Diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, over a thousand training examples are extracted from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. Overall, the results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.

Businesses using diffusion models to generate synthetic images should be aware that training data, including private photographs and copyrighted material, can be extracted from such models.
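
A minimal sketch of the generate-and-filter idea, assuming a Hugging Face diffusers checkpoint and CLIP embeddings as a cheap stand-in for the paper's pixel-space patch comparisons; the sample count and similarity threshold are illustrative, not the authors' exact settings ("Ann Graham Lotz" is a memorized prompt reported in the paper).

import itertools
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "Ann Graham Lotz"  # a training-set caption; example from the paper

# Generate: sample the same prompt many times with fresh seeds
# (the paper uses ~500 samples per prompt; 16 keeps the sketch cheap).
images = [pipe(prompt, num_inference_steps=50).images[0] for _ in range(16)]

# Embed with CLIP; cosine similarity of features is a cheap proxy for
# the paper's pixel-space patch distance.
inputs = proc(images=images, return_tensors="pt")
with torch.no_grad():
    feats = clip.get_image_features(**inputs)
feats = feats / feats.norm(dim=-1, keepdim=True)

# Filter: many near-duplicate pairs among independent samples suggest
# the model is regurgitating a memorized training image.
sims = feats @ feats.T
dup_pairs = [(i, j) for i, j in itertools.combinations(range(len(images)), 2)
             if sims[i, j] > 0.95]  # threshold is illustrative
print(f"{len(dup_pairs)} near-duplicate pairs for '{prompt}'")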

REPLUG: Retrieval-Augmented Black-Box Language Models

language models
natural language processing
machine learning
language modeling
natural language processing
retrieval

REPLUG is a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tunable retrieval model. It significantly improves the performance of GPT-3 (175B) on language modeling by 6.3% and of Codex on five-shot MMLU by 5.1%.

Businesses can use REPLUG to boost the performance of their existing language models, including closed API models, without retraining them or accessing their internals.
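
The core ensemble is simple enough to sketch: prepend each retrieved document to the input separately, query the frozen LM once per document, and average the next-token distributions weighted by retrieval scores. Below, a local gpt2 stands in for the black-box API model, and hardcoded documents and scores stand in for the tuned dense retriever (which REPLUG LSR trains by distilling the LM's own likelihoods).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is only a local stand-in; REPLUG targets API models like GPT-3,
# from which only output probabilities are needed.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def next_token_dist(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits[0, -1]
    return torch.softmax(logits, dim=-1)

query = "The capital of France is"
docs = ["Paris is the capital and largest city of France.",
        "France is a country in Western Europe."]
scores = torch.tensor([2.0, 1.0])  # placeholder retrieval scores

# REPLUG's ensemble: prepend each document separately, then average the
# next-token distributions, weighted by softmax-normalized scores.
weights = torch.softmax(scores, dim=0)
dist = sum(w * next_token_dist(doc + "\n" + query)
           for w, doc in zip(weights, docs))
print(tok.decode(dist.argmax()))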

Looped Transformers as Programmable Computers

transformer networks
machine learning
algorithm development
computing
algorithm development
machine learning

A framework is presented for using transformer networks as universal computers by programming them with specific weights and placing them in a loop. These building blocks are used to emulate a small instruction-set computer, so that iterative algorithms can be mapped to programs executed by a looped, 13-layer transformer. The transformer can emulate a basic calculator, a basic linear algebra library, and in-context learning algorithms that employ backpropagation.

Businesses looking to develop custom algorithms for their specific use cases can use looped transformers to emulate basic computing blocks and execute full-fledged, general-purpose programs.
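
Stripped of the paper's weight constructions, the control flow is just a fixed block applied repeatedly to a sequence that serves as the machine state. The sketch below shows only that looping pattern, with random weights; in the paper the weights are hand-designed so that each pass executes one instruction.

import torch
import torch.nn as nn

# One fixed transformer block; the paper constructs a 13-layer stack
# whose single forward pass performs one instruction cycle.
d_model, seq_len = 64, 32
block = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                   batch_first=True)

# The "machine state": a sequence holding program, memory, and a
# program counter in the paper's construction (random here).
state = torch.randn(1, seq_len, d_model)

with torch.no_grad():
    for _ in range(100):       # each iteration = one instruction cycle
        state = block(state)

# With the constructed weights, `state` would now encode the program's
# output (e.g., the result of a calculator routine).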

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

text-to-audio synthesis
natural language processing
audio processing
audio content generation for marketing and e-learning
audio responses for chatbots and virtual assistants
audiobook production

AudioLDM is a text-to-audio (TTA) system built on a latent space that learns continuous audio representations from contrastive language-audio pretraining (CLAP) latents. It achieves state-of-the-art TTA performance with improved generation quality and computational efficiency, and it enables text-guided audio manipulations (e.g., style transfer) in a zero-shot fashion.

Businesses can use AudioLDM to generate high-quality audio for various applications, such as marketing videos, e-learning materials, and audiobooks. It can also be used to enhance customer experience by providing natural-sounding audio responses in chatbots and virtual assistants.
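
For generation, the whole chain (text to CLAP embedding, latent diffusion, VAE decoding, vocoder) is exposed as a single call in the diffusers port of AudioLDM; the checkpoint name and arguments below follow that port rather than the paper itself.

import scipy.io.wavfile
import torch
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
).to("cuda")

# One call covers text -> CLAP text embedding -> latent diffusion ->
# VAE decode -> vocoder waveform.
audio = pipe(
    "a relaxing acoustic guitar melody",
    num_inference_steps=200,
    audio_length_in_s=5.0,
).audios[0]

# AudioLDM generates 16 kHz audio.
scipy.io.wavfile.write("guitar.wav", rate=16000, data=audio)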

Sample Efficient Deep Reinforcement Learning via Local Planning

simulator-based RL
reinforcement learning
artificial intelligence
decision-making optimization
game AI

Uncertainty-first local planning (UFLP) is an algorithmic framework for sample-efficient deep reinforcement learning (RL) with a simulator. By occasionally resetting the simulator to previously visited states, it substantially reduces the sample cost of several baseline RL algorithms on hard exploration tasks and achieves super-human performance on the notoriously difficult Atari game Montezuma's Revenge.

Businesses can use UFLP to optimize various decision-making scenarios, such as inventory management, resource allocation, and pricing strategies. It can also be applied to game AI to enhance player experience and engagement.
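
A minimal sketch of the local-planning reset rule, assuming a simulator whose full state can be saved and restored (get_state/set_state below are assumed hooks; the Atari ALE exposes equivalents), with visit counts as a crude stand-in for the paper's uncertainty-first criterion.

import random
from collections import defaultdict

class UFLPResetter:
    def __init__(self, reset_prob=0.5):
        self.reset_prob = reset_prob
        self.saved = []                 # (simulator_state, obs_key)
        self.visits = defaultdict(int)

    def begin_episode(self, env):
        # With some probability, restart from a previously visited
        # high-uncertainty (here: least-visited) state instead of the
        # initial state distribution.
        if self.saved and random.random() < self.reset_prob:
            state, key = min(self.saved, key=lambda e: self.visits[e[1]])
            env.set_state(state)        # assumed simulator hook
            return key
        return env.reset()

    def record(self, env, obs_key):
        # Called once per environment step by the RL training loop.
        self.saved.append((env.get_state(), obs_key))
        self.visits[obs_key] += 1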
