The case for 4-bit precision: k-bit Inference Scaling Laws
Shows that 4-bit precision is almost universally optimal for the trade-off between total model bits and zero-shot accuracy.
Implementing quantization methods to reduce memory footprint and inference latency while maintaining zero-shot accuracy.
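As a minimal illustration of the kind of low-bit quantization the paper studies, the sketch below implements generic blockwise absmax quantization to 4-bit integers in NumPy. This is an assumption-laden toy, not the paper's exact method: block size, the symmetric int4 range [-7, 7], and the float scale per block are all choices made here for clarity.

```python
import numpy as np

def quantize_4bit(x, block_size=64):
    """Blockwise absmax quantization to signed 4-bit values.
    A generic sketch, not the paper's exact scheme: each block of
    `block_size` values stores one float scale plus int4 codes."""
    blocks = x.reshape(-1, block_size)
    scale = np.abs(blocks).max(axis=1, keepdims=True) / 7.0  # int4 range [-7, 7]
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    """Recover an approximation of the original tensor."""
    return (q.astype(np.float32) * scale).reshape(-1)

# Round-trip a random weight vector and measure the worst-case error.
x = np.random.randn(256).astype(np.float32)
q, s = quantize_4bit(x)
x_hat = dequantize_4bit(q, s)
err = np.abs(x - x_hat).max()
```

The per-block scale bounds the reconstruction error at half a quantization step per block, which is why absmax blocking is a common baseline for sub-8-bit inference.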
Natural Language to Code Generation in Interactive Data Science Notebooks
Builds a benchmark and a language model for automatic code generation in data science notebooks.
Improving the accuracy and efficiency of AI pair programmers that synthesize code from natural-language intents.
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Demonstrates that training on a large dataset of automatically generated instructions yields a model that outperforms models trained on manually curated datasets.
Using model-generated data as a cost-effective alternative to crowdsourcing for dataset expansion and diversification.
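The model-generated-data loop can be sketched in a few lines: seed with a small set of human-written examples, prompt a model with random in-context demonstrations, and keep novel generations. Everything here is a hypothetical stand-in, in particular `generate_fn`, which abstracts the actual language-model call; the paper's own filtering and prompting are more involved.

```python
import random

def expand_dataset(seed_examples, generate_fn, n_new=100):
    """Grow a small seed set of instruction examples with model generations.
    `generate_fn(demos)` is a hypothetical stand-in for a language-model call
    that takes a few demonstrations and returns a new example dict."""
    seen = {ex["instruction"] for ex in seed_examples}
    dataset = list(seed_examples)
    while len(dataset) - len(seed_examples) < n_new:
        demos = random.sample(seed_examples, k=min(3, len(seed_examples)))
        candidate = generate_fn(demos)
        if candidate["instruction"] not in seen:  # simple exact-match dedup
            seen.add(candidate["instruction"])
            dataset.append(candidate)
    return dataset

# Demo with a fake generator that just counts up (no real model involved).
counter = {"i": 0}
def fake_gen(demos):
    counter["i"] += 1
    return {"instruction": f"task {counter['i']}", "output": "done"}

seeds = [{"instruction": "seed task", "output": "x"}]
expanded = expand_dataset(seeds, fake_gen, n_new=5)
```

Deduplication is the minimal quality filter; in practice, generated examples also need validity and diversity checks before training on them.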
DSI++: Updating Transformer Memory with New Documents
Presents DSI++, a continual learning challenge for Differentiable Search Indices: incrementally index new documents while still answering queries over both previously and newly indexed documents.
Provides a solution for deploying differentiable search indices when the corpus changes over time, mitigating forgetting by a significant margin and improving average Hits@10 over competitive baselines.
Scalable Diffusion Models with Transformers
Explores a new class of diffusion models based on the transformer architecture, achieving a state-of-the-art FID of 2.27 on class-conditional ImageNet 256x256 benchmarks.
Offers a scalable and efficient solution for training latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches.
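The "transformer that operates on latent patches" idea amounts to splitting a latent feature map into non-overlapping patches and flattening each into a token. The sketch below shows only this patchify step in NumPy; the channel count, 2x2 patch size, and the absence of a learned linear embedding are simplifications made here, not details from the paper.

```python
import numpy as np

def patchify(latent, patch_size=2):
    """Split a latent feature map of shape (C, H, W) into a sequence of
    flattened patch tokens of shape (num_patches, C * p * p).
    A minimal sketch of tokenizing latents for a transformer backbone;
    real models follow this with a learned linear projection."""
    c, h, w = latent.shape
    assert h % patch_size == 0 and w % patch_size == 0
    ph, pw = h // patch_size, w // patch_size
    x = latent.reshape(c, ph, patch_size, pw, patch_size)
    x = x.transpose(1, 3, 0, 2, 4)  # -> (ph, pw, c, p, p)
    return x.reshape(ph * pw, c * patch_size * patch_size)

# A 4-channel 32x32 latent with 2x2 patches yields 256 tokens of dim 16.
latent = np.arange(4 * 32 * 32, dtype=np.float32).reshape(4, 32, 32)
tokens = patchify(latent)
```

Smaller patches give longer token sequences and more compute per image, which is one of the scaling axes such architectures expose.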