Few-shot Learning with Retrieval Augmented Language Models
Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming PaLM by 3% despite having 50x fewer parameters.
Atlas is able to learn knowledge-intensive tasks from very few training examples. It performs well on a wide range of tasks, including question answering and fact checking. It reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B-parameter model by 3% despite having 50x fewer parameters.
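A minimal retrieve-then-read sketch of the idea: fetch the passages most relevant to a question and prepend them to the prompt before a reader model generates the answer. The bag-of-words retriever and stubbed reader below are toy stand-ins, not the learned dense retriever and seq2seq reader a system like Atlas uses.

```python
import numpy as np

# Toy corpus standing in for a retrieval index (illustrative only).
corpus = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "The Great Wall of China stretches across northern China.",
]

def embed(text, dim=64):
    """Hash each token into a fixed-size vector (toy stand-in for a learned encoder)."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def retrieve(query, k=2):
    """Return the k passages most similar to the query by cosine similarity."""
    q = embed(query)
    ranked = sorted(((float(q @ embed(p)), p) for p in corpus), reverse=True)
    return [p for _, p in ranked[:k]]

def answer(question):
    """Retrieve-then-read: combine retrieved passages with the question,
    then hand the prompt to a reader LM (stubbed out here)."""
    passages = retrieve(question)
    prompt = "\n".join(f"context: {p}" for p in passages) + f"\nquestion: {question}"
    return prompt  # a real reader model would generate the answer from this prompt

print(answer("Where is the Eiffel Tower?"))
```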
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
A 22B-parameter model trained with BTM, a communication-efficient training procedure, performs as well as a Transformer LM trained with 2.5x more compute.
BTM is an algorithm for embarrassingly parallel training of large language models. It learns independent expert LMs specialized to different textual domains, which can be added or removed to update data coverage, ensembled to generalize to new domains, or averaged to collapse back to a single LM for efficient inference. Experiments show that BTM improves perplexity compared to Transformer LMs when controlling for training cost.
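A minimal sketch of the two ways the experts can be combined, assuming toy numpy arrays stand in for each expert's next-token distribution and parameters; the domain names and weights are illustrative, not from the paper.

```python
import numpy as np

# Toy next-token distributions from three independently trained domain experts.
expert_probs = {
    "news":   np.array([0.50, 0.20, 0.15, 0.10, 0.05]),
    "code":   np.array([0.05, 0.10, 0.60, 0.15, 0.10]),
    "papers": np.array([0.20, 0.30, 0.25, 0.15, 0.10]),
}

# Domain weights, e.g. a posterior over domains given the current context.
weights = {"news": 0.6, "code": 0.1, "papers": 0.3}

# Ensembling: mix the experts' next-token distributions with the domain weights.
ensembled = sum(weights[d] * p for d, p in expert_probs.items())
print("ensembled next-token distribution:", ensembled)

# Parameter averaging: collapse the experts back into a single model by taking
# a weighted average of their parameters (shown here on toy weight matrices).
expert_params = {d: np.random.default_rng(i).normal(size=(4, 4))
                 for i, d in enumerate(expert_probs)}
merged = sum(weights[d] * expert_params[d] for d in expert_params)
print("merged parameter matrix shape:", merged.shape)
```

Ensembling keeps every expert around at inference time, while averaging trades some of that flexibility for the cost profile of a single LM.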