Teaching Small Language Models to Reason
Proposes the transfer of reasoning capabilities from large language models to smaller ones via knowledge distillation, achieving state-of-the-art results on arithmetic, commonsense, and symbolic reasoning datasets.
Can improve the task performance of smaller language models and enable them to perform reasoning tasks that previously only large models could accomplish.
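A minimal sketch of this kind of distillation recipe, assuming the common setup of prompting a large teacher for chain-of-thought rationales and finetuning the small student on the verified ones; `teacher_generate`, the prompt text, and the answer-consistency filter are illustrative placeholders, not the paper's code.

```python
# Chain-of-thought distillation sketch: a large teacher annotates training
# questions with step-by-step rationales; a small student is then finetuned
# to reproduce rationale + answer. Names here are illustrative placeholders.

FEW_SHOT_PROMPT = (
    "Q: Natalia sold clips to 48 friends and then half as many more. "
    "How many clips did she sell in total?\n"
    "A: She sold 48 + 48/2 = 72 clips. The answer is 72.\n\n"
)

def build_teacher_prompt(question):
    """Prepend few-shot CoT exemplars so the teacher emits a rationale."""
    return FEW_SHOT_PROMPT + f"Q: {question}\nA:"

def make_distillation_example(question, gold_answer, teacher_generate):
    """Ask the teacher for a rationale; keep it only if it reaches the gold answer."""
    rationale = teacher_generate(build_teacher_prompt(question))
    if gold_answer in rationale:  # simple answer-consistency filter
        return {"input": f"Q: {question}\nA:", "target": rationale}
    return None  # discard rationales that end in a wrong answer

def build_student_dataset(examples, teacher_generate):
    """Collect (input, target) pairs for finetuning the small student model."""
    dataset = []
    for ex in examples:
        item = make_distillation_example(ex["question"], ex["answer"], teacher_generate)
        if item is not None:
            dataset.append(item)
    return dataset
```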
ALERT: Adapting Language Models to Reasoning Tasks
Introduces ALERT, a benchmark and suite of analyses for assessing language models' reasoning abilities and investigating the role of finetuning in learning reasoning skills.
Provides a test bed to assess any language model on fine-grained reasoning skills, and highlights the importance of finetuning in learning reasoning skills such as textual entailment, abductive reasoning, and analogical reasoning.
Self-Prompting Large Language Models for Open-Domain QA
Proposes using Large Language Models (LLMs) as a knowledge corpus for Open-Domain Question Answering (ODQA), and presents a Self-Prompting framework that lets LLMs perform ODQA without training data or an external knowledge corpus.
Simplifies the ODQA architecture and achieves better results than previous state-of-the-art methods without a retriever.
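A rough sketch of a Self-Prompting-style pipeline under the description above: the LLM writes its own pseudo passage/question/answer triples, and the ones most similar to the test question are reused as in-context demonstrations at answering time. `llm_generate`, `embed`, and the prompt strings are placeholders, not the paper's implementation.

```python
# Self-Prompting sketch: the LLM generates its own QA data, then answers new
# questions by in-context learning over the most relevant generated pairs.

import numpy as np

def generate_pseudo_qa(topic, llm_generate):
    """Have the LLM write a passage, then a question and answer grounded in it."""
    passage = llm_generate(f"Write a short factual passage about {topic}.")
    question = llm_generate(f"Passage: {passage}\nWrite a question answered by this passage.")
    answer = llm_generate(f"Passage: {passage}\nQuestion: {question}\nAnswer briefly:")
    return {"passage": passage, "question": question, "answer": answer}

def select_demonstrations(query, pool, embed, k=4):
    """Pick the k pseudo QA pairs whose questions are most similar to the query."""
    q_vec = embed(query)
    scored = sorted(pool, key=lambda d: -float(np.dot(q_vec, embed(d["question"]))))
    return scored[:k]

def answer(query, pool, llm_generate, embed):
    """Answer the query with the selected pseudo QA pairs as in-context examples."""
    demos = select_demonstrations(query, pool, embed)
    prompt = ""
    for d in demos:
        prompt += f"Question: {d['question']}\nAnswer: {d['answer']}\n\n"
    prompt += f"Question: {query}\nAnswer:"
    return llm_generate(prompt)
```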
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
Proposes two simple changes to the FiD architecture to speed up inference by 7x.
Provides actionable insights for faster and more efficient inference in retrieval-augmented language models across a range of NLP tasks.
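To my reading, the two changes FiDO makes are layer-sparse cross-attention and multi-query attention in the decoder. Below is a minimal sketch of multi-query attention only: all query heads share a single key/value head, so the key/value cache that dominates FiD decoder inference shrinks by a factor of the head count. Shapes and names are illustrative, not FiDO's code.

```python
# Multi-query cross-attention sketch: one shared key/value head for all query heads.

import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)      # one query projection per head
        self.k_proj = nn.Linear(d_model, self.d_head)  # single shared key head
        self.v_proj = nn.Linear(d_model, self.d_head)  # single shared value head
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, memory):
        # x: (batch, tgt_len, d_model); memory: (batch, src_len, d_model)
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)  # (b, h, t, d)
        k = self.k_proj(memory).unsqueeze(1)  # (b, 1, s, d), broadcast over heads
        v = self.v_proj(memory).unsqueeze(1)  # (b, 1, s, d), broadcast over heads
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(out)

# Example: 2 decoder positions cross-attending over 8 retrieved-passage tokens.
mqa = MultiQueryAttention(d_model=64, n_heads=8)
y = mqa(torch.randn(1, 2, 64), torch.randn(1, 8, 64))
print(y.shape)  # torch.Size([1, 2, 64])
```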
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
Finds that human evaluators prefer Contrastive Search over Contrastive Decoding, and that the MAUVE score does not correlate with human preference.
Provides recommendations for better evaluation metrics for open-ended text generation and improving the diversity and coherence of generated text.
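For context, a minimal sketch of the contrastive-search selection rule compared in the study: among the top-k next-token candidates, pick the one that balances model confidence against a degeneration penalty (maximum cosine similarity to the hidden states of the text generated so far). The tensors below are toy inputs, not outputs of a real model.

```python
# Contrastive search step: score = (1 - alpha) * p(candidate) - alpha * max cosine
# similarity between the candidate's hidden state and the context's hidden states.

import torch

def contrastive_search_step(cand_probs, cand_hidden, context_hidden, alpha=0.6):
    """cand_probs: (k,) probabilities of the top-k candidates.
    cand_hidden: (k, d) hidden state the model would produce for each candidate.
    context_hidden: (t, d) hidden states of the tokens generated so far.
    Returns the index of the selected candidate."""
    cand = torch.nn.functional.normalize(cand_hidden, dim=-1)
    ctx = torch.nn.functional.normalize(context_hidden, dim=-1)
    penalty = (cand @ ctx.T).max(dim=-1).values           # degeneration penalty, (k,)
    score = (1.0 - alpha) * cand_probs - alpha * penalty   # (k,)
    return int(score.argmax())

# Toy example: candidate 0 is most probable but nearly repeats the last context
# state, so the penalty steers selection toward a less repetitive candidate.
probs = torch.tensor([0.55, 0.40, 0.05])
ctx = torch.randn(5, 16)
cands = torch.stack([ctx[-1] + 0.01 * torch.randn(16), torch.randn(16), torch.randn(16)])
print(contrastive_search_step(probs, cands, ctx))
```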