Faithful Chain-of-Thought Reasoning
Faithful CoT outperforms standard CoT prompting on 9 of 10 reasoning datasets by decomposing a reasoning task into two stages: translation and deterministic problem solving.
Offers a framework that improves language-model performance on complex reasoning tasks by pairing an LM with a deterministic solver: the LM translates a query into a symbolic reasoning chain, and the solver executes it to produce the answer. The approach is demonstrated on 10 reasoning datasets from 4 diverse domains, achieving new state-of-the-art few-shot performance on 7 of the 10 datasets.
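The two-stage idea above can be sketched in a few lines. This is an illustrative toy, not the paper's actual prompts or pipeline: stage 1 would be an LM call that translates the question into a symbolic reasoning chain (here the LM's hypothetical output is hard-coded), and stage 2 executes that chain with a deterministic solver, so the final answer follows faithfully from the generated program.

```python
question = "Alice has 3 apples and buys 2 bags of 4 apples each. How many apples does she have?"

# Stage 1 (translation): pretend this Python snippet came from the language model.
reasoning_chain = """
apples_initial = 3
bags = 2
apples_per_bag = 4
answer = apples_initial + bags * apples_per_bag
"""

# Stage 2 (problem solving): run the chain with a deterministic executor,
# so the answer is derived from the reasoning chain rather than guessed.
scope = {}
exec(reasoning_chain, scope)
print(scope["answer"])  # → 11
```

Because the solver is deterministic, the reasoning chain is a faithful explanation of the answer: changing the chain necessarily changes the output.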
Mathematical Capabilities of ChatGPT
Presents a new dataset, GHOSTS, the first natural-language dataset covering graduate-level mathematics, which provides a holistic overview of the mathematical capabilities of language models.
Shows that ChatGPT's mathematical abilities are significantly below those of an average mathematics graduate student. Hence it would not pass a graduate-level university exam, but it can still be helpful for some mathematical tasks that come up in the daily professional activities of mathematicians.
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Finds that task balancing and enrichment tricks are critical to performance. Makes Flan datasets & templates publicly available.
Provides insights into the design decisions of instruction tuning methods, which can improve the performance of language models in various tasks. It also makes the Flan 2022 collection of datasets, templates, and methods publicly available to accelerate research on instruction tuning.
Grounding Language Models to Images for Multimodal Generation
Proposes an efficient method to ground text-only language models to the visual domain, achieving strong zero-shot performance on grounded tasks such as contextual image retrieval and multimodal dialogue.
Enables the model to process and generate arbitrarily interleaved image-and-text data, providing an effective, general solution for leveraging pretrained language models in visually grounded settings.
Scaling laws for single-agent reinforcement learning
Finds that intrinsic performance scales as a power law in model size and environment interactions, with the optimal model size scaling as a power law in the training compute budget.
Suggests that simply adding training compute does not guarantee better performance: compute is used optimally only when model size is scaled as a power law with the budget, and the optimal model size also depends on the environment and other properties of the training setup.
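The power-law relationship described above can be sketched as a log-log regression. This is a hedged illustration with made-up constants, not the paper's measurements: if intrinsic performance follows P = a * N^b for model size N, then log P is linear in log N, so the exponent b can be recovered with an ordinary least-squares fit in log-log space.

```python
import math

# Synthetic (model_size, performance) points generated from P = 2 * N^0.3.
# The constants 2 and 0.3 are illustrative, not values from the paper.
sizes = [1e6, 1e7, 1e8, 1e9]
perf = [2 * n ** 0.3 for n in sizes]

# Least-squares slope and intercept in log-log space.
xs = [math.log(n) for n in sizes]
ys = [math.log(p) for p in perf]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = math.exp(mean_y - b * mean_x)

print(round(b, 3), round(a, 3))  # recovers the exponent 0.3 and prefactor 2.0
```

Fitting in log space is the standard way such scaling exponents are estimated, since a power law appears as a straight line on a log-log plot.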