
Atlas: Few-shot Learning with Retrieval Augmented Language Models

Retrieval augmented models
Natural language processing
Machine learning
Question answering
Fact checking
Few-shot learning

Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming PaLM by 3% despite having 50x fewer parameters.

Atlas learns knowledge-intensive tasks from very few training examples and performs well on a wide range of tasks, including question answering and fact checking. With only 64 examples it reaches over 42% accuracy on Natural Questions, outperforming the 540B-parameter PaLM by 3% despite having 50x fewer parameters.
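
As a rough illustration of the retrieve-then-read setup behind retrieval-augmented models like Atlas, the sketch below scores passages against a question and conditions a reader input on the top results. The word-overlap scorer, the toy passage list, and the helper names are hypothetical stand-ins, not the paper's dense retriever or its sequence-to-sequence reader.

```python
# Minimal sketch of retrieval-augmented question answering, assuming a
# retrieve-then-read pipeline. The overlap scorer and toy passages are
# hypothetical stand-ins for a trained dense retriever over a real corpus.

PASSAGES = [
    "The Eiffel Tower is located in Paris, France.",
    "Mount Everest is the highest mountain above sea level.",
    "Natural Questions is a benchmark built from real search queries.",
]

def score(question: str, passage: str) -> int:
    # Bag-of-words overlap as a stand-in for dense inner-product retrieval.
    return len(set(question.lower().split()) & set(passage.lower().split()))

def retrieve(question: str, k: int = 2) -> list[str]:
    # Return the k passages that score highest against the question.
    return sorted(PASSAGES, key=lambda p: score(question, p), reverse=True)[:k]

def build_reader_input(question: str) -> str:
    # A fusion-in-decoder-style reader would encode each passage separately;
    # here the passages are simply concatenated into one conditioning string.
    context = " ".join(retrieve(question))
    return f"question: {question} context: {context}"

print(build_reader_input("Where is the Eiffel Tower located?"))
```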

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models

Large language models
Natural language processing
Machine learning
Natural language processing tasks
Parallel training
Efficient language models

A 22B-parameter model trained with BTM, a communication-efficient training procedure, performs as well as a Transformer LM trained with 2.5x more compute.

BTM is an algorithm for embarrassingly parallel training of large language models. It learns independent expert LMs specialized to different textual domains, which can be added or removed to update data coverage, ensembled to generalize to new domains, or averaged to collapse back to a single LM for efficient inference. Experiments show that BTM improves in- and out-of-domain perplexities compared to Transformer LMs when controlling for training cost.
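
To make the ensembling and averaging modes concrete, here is a minimal PyTorch sketch under assumed toy models: `TinyLM`, the domain weights, and the merging helper are illustrative stand-ins, not the paper's code. Uniform parameter averaging is the simplest way to collapse the experts; the mixture weights for ensembling could instead come from a learned domain posterior.

```python
# Minimal sketch of two BTM inference modes: ensembling expert LM output
# distributions, and averaging expert parameters into a single model.
# TinyLM and the mixture weights are illustrative, not the paper's setup.
import copy
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Stand-in for a full expert language model."""
    def __init__(self, vocab=100, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, tokens):
        return self.out(self.emb(tokens))  # next-token logits

experts = [TinyLM() for _ in range(3)]  # one expert per textual domain

def ensemble_logprobs(tokens, weights):
    # Mix the experts' output distributions, then return log-probabilities.
    probs = sum(w * e(tokens).softmax(-1) for e, w in zip(experts, weights))
    return probs.log()

def average_experts(experts):
    # Collapse the expert set into a single LM by averaging parameters,
    # which works because all experts share one architecture.
    merged = copy.deepcopy(experts[0])
    with torch.no_grad():
        for name, p in merged.named_parameters():
            stacked = torch.stack(
                [dict(e.named_parameters())[name] for e in experts])
            p.copy_(stacked.mean(0))
    return merged

tokens = torch.randint(0, 100, (1, 8))
print(ensemble_logprobs(tokens, weights=[0.5, 0.3, 0.2]).shape)  # (1, 8, 100)
print(average_experts(experts)(tokens).shape)                    # (1, 8, 100)
```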
