The Capacity for Moral Self-Correction in Large Language Models
Finds that the capability for moral self-correction emerges at 22B parameters and typically improves with increasing model size and RLHF training.
Language models trained with reinforcement learning from human feedback (RLHF) have the capability to morally self-correct: they can avoid producing harmful outputs when instructed to do so.
Energy Transformer
Replaces the sequence of feedforward transformer blocks with a single large Associative Memory model.
Introduces the Energy Transformer (ET), an architecture that departs from existing transformers in how it represents relationships between tokens.
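The associative-memory idea behind ET can be illustrated with a toy retrieval loop (a sketch only, not the paper's actual energy function or architecture): stored patterns act as memories, and a query is iteratively pulled toward the softmax-weighted combination of those patterns, a modern-Hopfield-style update.

```python
# Toy associative-memory retrieval (illustrative sketch, not the ET model).
import numpy as np

def retrieve(q, M, beta=4.0, steps=5):
    """Iteratively pull query q toward the stored pattern it most resembles.

    M holds one stored pattern per row; beta sharpens the softmax.
    """
    q = q.copy()
    for _ in range(steps):
        scores = beta * (M @ q)                 # similarity to each memory
        p = np.exp(scores - scores.max())
        p /= p.sum()                            # softmax attention weights
        q = p @ M                               # move toward weighted memories
    return q

M = np.array([[1.0, 0.0],   # stored pattern 0
              [0.0, 1.0]])  # stored pattern 1
q0 = np.array([0.9, 0.2])   # noisy query near pattern 0
q = retrieve(q0, M)         # converges close to pattern 0
```

After a few iterations the query has been "cleaned up" to approximately the nearest stored pattern, which is the basic behavior an associative memory provides.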
Big Little Transformer Decoder
Proposes Big Little Decoder, a framework that can reduce inference latency by up to 2x for a wide range of text generation applications without degrading performance.
Introduces a framework in which a small model and a large model collaboratively generate text, along with two effective policies to coordinate them.
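The coordination idea can be sketched as follows (a hypothetical fallback policy with toy stand-in models; the function names and confidence rule are illustrative, not the paper's API): the small model drafts each token cheaply, and the large model is invoked only when the small model's confidence drops below a threshold.

```python
# Hypothetical big-little decoding loop with a confidence-based fallback
# policy. The "models" below are deterministic toys standing in for LMs.

def small_model(context):
    # Toy stand-in: returns (token, confidence). A real system would run a
    # small LM and use e.g. the max softmax probability as confidence.
    token = (len(context) * 7) % 10
    confidence = 0.9 if token % 2 == 0 else 0.4
    return token, confidence

def large_model(context):
    # Toy stand-in for the expensive model.
    return (len(context) * 7 + 1) % 10

def big_little_decode(prompt, max_new_tokens=8, fallback_threshold=0.5):
    context = list(prompt)
    large_calls = 0
    for _ in range(max_new_tokens):
        token, conf = small_model(context)
        if conf < fallback_threshold:
            token = large_model(context)   # fall back to the big model
            large_calls += 1
        context.append(token)
    return context, large_calls

tokens, large_calls = big_little_decode([1, 2, 3])
```

In this toy run the large model is consulted on only half the steps, which is the source of the latency savings: most tokens are produced by the cheap model, and the expensive one handles only the hard positions.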
Learning Performance-Improving Code Edits
Large language models (LLMs) can suggest functionally correct, performance-improving code edits, helping programmers write efficient code. The smaller CODEGEN can match the larger CODEX on the task of suggesting such edits.
This paper demonstrates the ability of language models to suggest performance-improving edits for code, which can help improve the efficiency of programming workflows.
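As a concrete illustration of the kind of edit meant here (an example of my own, not drawn from the paper's benchmark): an LLM might rewrite quadratic-time string concatenation into a linear-time join while preserving the function's output.

```python
# Before: O(n^2) string building, since each += copies the growing string.
def slow_concat(words):
    out = ""
    for w in words:
        out += w + " "
    return out.strip()

# After: the suggested performance-improving edit, O(n) via str.join.
def fast_concat(words):
    return " ".join(words)

words = ["edit", "improves", "performance"]
assert slow_concat(words) == fast_concat(words)  # functionally equivalent
```

Both versions return the same string, so the edit is functionally correct while asymptotically faster, which is exactly the property the paper evaluates.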
Video Probabilistic Diffusion Models in Projected Latent Space
The projected latent video diffusion model (PVDM) learns a video's distribution in a low-dimensional latent space and can efficiently generate high-resolution videos with complex temporal dynamics. PVDM improves on the prior state of the art, obtaining an FVD score of 639.7 on a long-video (128 frames) generation benchmark.
This paper demonstrates the capacity of PVDM to efficiently generate high-resolution videos with complex temporal dynamics, which can be leveraged to improve video analysis workflows.