Teaching Small Language Models to Reason
Proposes the transfer of reasoning capabilities from large language models to smaller ones via knowledge distillation, achieving state-of-the-art results on arithmetic, commonsense, and symbolic reasoning datasets.
Can improve the task performance of smaller language models and enable them to perform reasoning tasks that previously only large models could accomplish.
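A minimal sketch of this kind of distillation recipe, assuming the common setup of prompting a large teacher for chain-of-thought rationales and finetuning the small student on the verified ones; `teacher_generate`, the prompt text, and the answer-consistency filter are illustrative placeholders, not the paper's code.

```python
# Chain-of-thought distillation sketch: a large teacher annotates training
# questions with step-by-step rationales; a small student is then finetuned
# to reproduce rationale + answer. Names here are illustrative placeholders.

FEW_SHOT_PROMPT = (
    "Q: Natalia sold clips to 48 friends and then half as many more. "
    "How many clips did she sell in total?\n"
    "A: She sold 48 + 48/2 = 72 clips. The answer is 72.\n\n"
)

def build_teacher_prompt(question):
    """Prepend few-shot CoT exemplars so the teacher emits a rationale."""
    return FEW_SHOT_PROMPT + f"Q: {question}\nA:"

def make_distillation_example(question, gold_answer, teacher_generate):
    """Ask the teacher for a rationale; keep it only if it reaches the gold answer."""
    rationale = teacher_generate(build_teacher_prompt(question))
    if gold_answer in rationale:  # simple answer-consistency filter
        return {"input": f"Q: {question}\nA:", "target": rationale}
    return None  # discard rationales that end in a wrong answer

def build_student_dataset(examples, teacher_generate):
    """Collect (input, target) pairs for finetuning the small student model."""
    dataset = []
    for ex in examples:
        item = make_distillation_example(ex["question"], ex["answer"], teacher_generate)
        if item is not None:
            dataset.append(item)
    return dataset
```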
ALERT: Adapting Language Models to Reasoning Tasks
Introduces ALERT, a benchmark and suite of analyses for assessing language models' reasoning abilities and investigating the role of finetuning in learning reasoning skills.
Provides a test bed to assess any language model on fine-grained reasoning skills, and highlights the importance of finetuning in learning reasoning skills such as textual entailment, abductive reasoning, and analogical reasoning.
Self-Prompting Large Language Models for Open-Domain QA
Proposes using Large Language Models (LLMs) as a knowledge corpus for Open-Domain Question Answering (ODQA), and presents a Self-Prompting framework that lets LLMs perform ODQA without training data or an external knowledge corpus.
Simplifies the ODQA architecture and achieves better results than previous state-of-the-art methods without a retriever.
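A rough sketch of a Self-Prompting-style pipeline under the description above: the LLM writes its own pseudo passage/question/answer triples, and the ones most similar to the test question are reused as in-context demonstrations at answering time. `llm_generate`, `embed`, and the prompt strings are placeholders, not the paper's implementation.

```python
# Self-Prompting sketch: the LLM generates its own QA data, then answers new
# questions by in-context learning over the most relevant generated pairs.

import numpy as np

def generate_pseudo_qa(topic, llm_generate):
    """Have the LLM write a passage, then a question and answer grounded in it."""
    passage = llm_generate(f"Write a short factual passage about {topic}.")
    question = llm_generate(f"Passage: {passage}\nWrite a question answered by this passage.")
    answer = llm_generate(f"Passage: {passage}\nQuestion: {question}\nAnswer briefly:")
    return {"passage": passage, "question": question, "answer": answer}

def select_demonstrations(query, pool, embed, k=4):
    """Pick the k pseudo QA pairs whose questions are most similar to the query."""
    q_vec = embed(query)
    scored = sorted(pool, key=lambda d: -float(np.dot(q_vec, embed(d["question"]))))
    return scored[:k]

def answer(query, pool, llm_generate, embed):
    """Answer the query with the selected pseudo QA pairs as in-context examples."""
    demos = select_demonstrations(query, pool, embed)
    prompt = ""
    for d in demos:
        prompt += f"Question: {d['question']}\nAnswer: {d['answer']}\n\n"
    prompt += f"Question: {query}\nAnswer:"
    return llm_generate(prompt)
```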
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
Proposes two simple changes to the FiD architecture to speed up inference by 7x.
Provides actionable insights for faster and more efficient inference in retrieval-augmented language models across a range of NLP tasks.
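To my reading, the two changes FiDO makes are layer-sparse cross-attention and multi-query attention in the decoder. Below is a minimal sketch of multi-query attention only: all query heads share a single key/value head, so the key/value cache that dominates FiD decoder inference shrinks by a factor of the head count. Shapes and names are illustrative, not FiDO's code.

```python
# Multi-query cross-attention sketch: one shared key/value head for all query heads.

import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)      # one query projection per head
        self.k_proj = nn.Linear(d_model, self.d_head)  # single shared key head
        self.v_proj = nn.Linear(d_model, self.d_head)  # single shared value head
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, memory):
        # x: (batch, tgt_len, d_model); memory: (batch, src_len, d_model)
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)  # (b, h, t, d)
        k = self.k_proj(memory).unsqueeze(1)  # (b, 1, s, d), broadcast over heads
        v = self.v_proj(memory).unsqueeze(1)  # (b, 1, s, d), broadcast over heads
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(out)

# Example: 2 decoder positions cross-attending over 8 retrieved-passage tokens.
mqa = MultiQueryAttention(d_model=64, n_heads=8)
y = mqa(torch.randn(1, 2, 64), torch.randn(1, 8, 64))
print(y.shape)  # torch.Size([1, 2, 64])
```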
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
Finds that human evaluators prefer Contrastive Search over Contrastive Decoding, and that the MAUVE score does not correlate with human preference.
Provides recommendations for better evaluation metrics for open-ended text generation and improving the diversity and coherence of generated text.
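For context, a minimal sketch of the contrastive-search selection rule compared in the study: among the top-k next-token candidates, pick the one that balances model confidence against a degeneration penalty (maximum cosine similarity to the hidden states of the text generated so far). The tensors below are toy inputs, not outputs of a real model.

```python
# Contrastive search step: score = (1 - alpha) * p(candidate) - alpha * max cosine
# similarity between the candidate's hidden state and the context's hidden states.

import torch

def contrastive_search_step(cand_probs, cand_hidden, context_hidden, alpha=0.6):
    """cand_probs: (k,) probabilities of the top-k candidates.
    cand_hidden: (k, d) hidden state the model would produce for each candidate.
    context_hidden: (t, d) hidden states of the tokens generated so far.
    Returns the index of the selected candidate."""
    cand = torch.nn.functional.normalize(cand_hidden, dim=-1)
    ctx = torch.nn.functional.normalize(context_hidden, dim=-1)
    penalty = (cand @ ctx.T).max(dim=-1).values           # degeneration penalty, (k,)
    score = (1.0 - alpha) * cand_probs - alpha * penalty   # (k,)
    return int(score.argmax())

# Toy example: candidate 0 is most probable but nearly repeats the last context
# state, so the penalty steers selection toward a less repetitive candidate.
probs = torch.tensor([0.55, 0.40, 0.05])
ctx = torch.randn(5, 16)
cands = torch.stack([ctx[-1] + 0.01 * torch.randn(16), torch.randn(16), torch.randn(16)])
print(contrastive_search_step(probs, cands, ctx))
```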