Thu Feb 16 2023
Wed Feb 15 2023

The Capacity for Moral Self-Correction in Large Language Models

Ethics
Natural language processing
Machine learning
Ethics and compliance
Customer service
Content moderation

Finds that the capability for moral self-correction emerges in models with 22B parameters, and typically improves with increasing model size and RLHF training.

Language models trained with reinforcement learning from human feedback (RLHF) have the capability to morally self-correct: they can avoid producing harmful outputs when instructed to do so.
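The setup is purely prompt-based: the model is simply instructed to avoid a harmful behavior. Below is a minimal sketch of that idea; `generate` is a hypothetical placeholder for any RLHF-trained model API, and the prompts are illustrative rather than the paper's exact wording.

```python
# Minimal sketch of instruction-based moral self-correction.
# `generate` is a hypothetical stand-in for an RLHF-trained model API.

def generate(prompt: str) -> str:
    raise NotImplementedError("call your RLHF-trained language model here")

question = (
    "A doctor and a nurse walk into the break room. "
    "Who is more likely to be in charge?"
)

# Baseline: no self-correction instruction.
baseline = generate(question)

# Instructed: ask the model not to rely on stereotypes, in the spirit of the
# finding that sufficiently large RLHF models follow such instructions.
instructed = generate(
    "Please answer without relying on gender or occupational stereotypes.\n\n"
    + question
)

print("baseline:  ", baseline)
print("instructed:", instructed)
```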

Energy Transformer

Artificial neural networks
Graph neural networks
Deep learning
Image completion
Graph anomaly detection
Predictive maintenance

Replaces the sequence of feedforward transformer blocks with a single large Associative Memory model.

Introduces the Energy Transformer (ET), an architecture that departs from existing transformers in how it represents relationships between tokens.
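As a rough illustration of the associative-memory idea, the sketch below refines token representations by gradient descent on an energy function inside a single recurrent block. The modern-Hopfield-style energy is an assumed stand-in for illustration, not the paper's exact formulation.

```python
# Illustrative sketch: an Energy-Transformer-style block that refines token
# representations by descending an energy function. The energy used here is a
# simple modern-Hopfield-style stand-in, not the paper's exact objective.
import torch

def energy(x, memories):
    # x: (num_tokens, dim); memories: (num_patterns, dim)
    # Tokens are pulled toward stored associative-memory patterns.
    return -torch.logsumexp(x @ memories.T, dim=-1).sum()

def et_block(x, memories, steps=12, lr=0.1):
    x = x.clone().requires_grad_(True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(energy(x, memories), x)
        x = (x - lr * grad).detach().requires_grad_(True)  # descend the energy
    return x.detach()

tokens = torch.randn(16, 64)      # e.g. patch embeddings, some of them masked
memories = torch.randn(128, 64)   # learned memory patterns (random here)
refined = et_block(tokens, memories)
```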

Big Little Transformer Decoder

Transformer architecture
Natural language processing
Deep learning
Machine translation
Summarization
Language modeling

Proposes Big Little Decoder, a framework that improves inference efficiency and latency (roughly 2x) for a wide range of text generation applications without degrading performance.

Introduces a framework in which two models of different sizes collaboratively generate text, and proposes two effective policies to coordinate the small and large models.
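A simplified view of that coordination is sketched below: the small model keeps drafting while it is confident, the large model takes over when it is not (fallback), and a drafted token the large model strongly rejects is undone (rollback). `small_model` and `large_model` are hypothetical callables returning next-token probability distributions, and the thresholds are illustrative.

```python
# Simplified sketch of small/large coordination with confidence-based
# fallback and rollback. `small_model` / `large_model` are hypothetical
# functions mapping a token-id prefix to a {token_id: probability} dict.

def bild_generate(prompt_ids, small_model, large_model,
                  fallback_thresh=0.6, rollback_thresh=0.3, max_len=64):
    ids = list(prompt_ids)
    while len(ids) < max_len:
        probs = small_model(ids)
        tok, conf = max(probs.items(), key=lambda kv: kv[1])
        if conf >= fallback_thresh:
            ids.append(tok)          # small model is confident: keep drafting
            continue
        # Rollback: if the large model would not have produced the small
        # model's last drafted token, undo it before handing over control.
        if len(ids) > len(prompt_ids):
            prev = large_model(ids[:-1])
            if prev.get(ids[-1], 0.0) < rollback_thresh:
                ids.pop()
        # Fallback: let the large model decode this step.
        big = large_model(ids)
        ids.append(max(big, key=big.get))
    return ids
```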

Learning Performance-Improving Code Edits

Software engineering
Artificial intelligence
Computer science
Optimizing compiler efficiency
Improving programming efficiency

Large language models (LLMs) can suggest functionally correct, performance-improving code edits, helping programmers write more efficient code. CODEGEN can match the performance of the larger CODEX on the task of suggesting such edits.

This paper demonstrates the ability of language models to suggest performance-improving edits to code, which can help improve the efficiency of programming workflows.
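The workflow this enables is straightforward to sketch: prompt a code model with a working-but-slow program, ask for a functionally equivalent faster version, then verify correctness and measure the speedup before accepting the edit. In the sketch below, `llm_complete` is a hypothetical placeholder for whichever model (e.g. CODEGEN or CODEX) is available, and the slow function is an invented example.

```python
# Hedged sketch of using a code LLM to suggest a performance-improving edit.
# `llm_complete` is a hypothetical placeholder for a code-generation API.

slow_program = '''
def count_duplicate_pairs(xs):
    # O(n^2) scan for equal pairs
    return sum(1 for i in range(len(xs))
                 for j in range(i + 1, len(xs)) if xs[i] == xs[j])
'''

prompt = (
    "Rewrite the following Python function so it runs faster but returns "
    "exactly the same results for every input:\n\n" + slow_program
)

def llm_complete(prompt: str) -> str:
    raise NotImplementedError("call a code LLM (e.g. CODEGEN or CODEX) here")

fast_program = llm_complete(prompt)

# Before accepting the edit: run both versions on shared test inputs to check
# functional equivalence, then time them (e.g. with timeit) to confirm a speedup.
```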

Video Probabilistic Diffusion Models in Projected Latent Space

Video processing
Artificial intelligence
Computer science
Generating high-resolution videos
Improving video analysis workflows

The projected latent video diffusion model (PVDM) learns a video's distribution in a low-dimensional latent space and can efficiently generate high-resolution videos with complex temporal dynamics. PVDM improves on the prior state of the art, obtaining an FVD score of 639.7 on a long-video (128-frame) generation benchmark.

This paper demonstrates the capacity of PVDM to efficiently generate high-resolution videos with complex temporal dynamics. It can be leveraged to improve video analysis workflows.
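The pipeline is easy to picture: an autoencoder projects the video into a compact latent, a diffusion model is trained in that latent space, and samples are decoded back to frames. The sketch below shows the sampling side under those assumptions; `decoder` and `denoiser` are hypothetical stand-ins for PVDM's trained networks, and the simple loop glosses over the actual sampler.

```python
# Schematic sketch of sampling from a projected-latent video diffusion model.
# `decoder` and `denoiser` are hypothetical stand-ins for trained networks.
import torch

def sample_video(decoder, denoiser, latent_shape, steps=50):
    z = torch.randn(latent_shape)      # start from Gaussian noise in latent space
    for t in reversed(range(steps)):
        z = denoiser(z, t)             # one reverse-diffusion step in the latent
    return decoder(z)                  # map the clean latent back to video frames

# Training-side intuition (not shown): encode real videos into latents with the
# autoencoder, add noise, and train `denoiser` to remove it, as in standard
# latent diffusion, but applied to videos.
```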

Tue Feb 14 2023
Mon Feb 13 2023
Sun Feb 12 2023
Sat Feb 11 2023