Mon Jul 25 2022
Thu Jul 21 2022

Language Model Cascades

Language Models
Artificial Intelligence
Machine Learning
Natural Language Processing
Probabilistic Programming

Formalizes the paradigm of language model cascades, which includes scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use.

Provides insights into how composing multiple models can expand capabilities, especially in cases with control flow and dynamic structure, and into the probabilistic-programming techniques this requires. Offers recommendations for implementing disparate model structures and inference strategies in a unified language.
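The cascade idea can be sketched as a small probabilistic program: string-valued random variables (a reasoning trace, then an answer) chained together, optionally filtered by a verifier. This is a minimal illustration, not the paper's implementation; `lm` and `verifier` below are hard-coded stubs standing in for real language model calls.

```python
def lm(prompt):
    # Stub language model: a real cascade would sample a continuation
    # from an LM here. Canned outputs keep the sketch runnable.
    canned = {
        "Q: 2+2? Think step by step.": "2 plus 2 is 4. Answer: 4",
        "Q: 2+2?": "Answer: 4",
    }
    return canned.get(prompt, "Answer: unknown")

def verifier(question, trace):
    # Stub verifier: accept a trace only if its stated answer checks out.
    # (STaR and verifier-style cascades score traces like this.)
    return trace.endswith("Answer: 4")

def cascade(question):
    # Chain of thought: condition the answer on an intermediate trace.
    trace = lm(f"{question} Think step by step.")
    if verifier(question, trace):
        return trace.split("Answer:")[-1].strip()
    # Control flow: fall back to direct answering if the trace is rejected.
    return lm(question).split("Answer:")[-1].strip()

print(cascade("Q: 2+2?"))  # -> 4
```

The point of the formalism is that the `if`/fallback control flow and the dynamic prompt construction are ordinary program structure, which is what makes probabilistic-programming tools a natural fit.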

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

Transformer Models
Artificial Intelligence
Machine Learning
Natural Language Processing
Scaling Models

Conducts a systematic study of the scaling behaviour of ten diverse model architectures, including Transformers, Switch Transformers, Universal Transformers, Dynamic Convolutions, Performers, and the recently proposed MLP-Mixers.

Highlights the importance of considering model architecture when performing scaling and how inductive bias affects scaling behaviour, with significant implications for how model architectures are evaluated in the community. Offers insights into upstream (pretraining) and downstream (transfer) influences.
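Scaling behaviour in such studies is typically summarized by fitting a power law L(N) ≈ a · N^(−b) to loss versus parameter count; architectures with different inductive biases yield different fitted exponents. Below is a minimal stdlib-only sketch of that fit via least squares in log-log space; the data points are invented for illustration, not taken from the paper.

```python
import math

params = [1e6, 1e7, 1e8, 1e9]   # hypothetical parameter counts N
loss   = [4.0, 2.8, 2.0, 1.4]   # hypothetical eval losses L(N)

# Fit log L = log a - b * log N by ordinary least squares.
xs = [math.log(n) for n in params]
ys = [math.log(v) for v in loss]
k = len(xs)
mx, my = sum(xs) / k, sum(ys) / k
b = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
     sum((x - mx) ** 2 for x in xs)
a = math.exp(my + b * mx)
print(f"L(N) ~ {a:.2f} * N^-{b:.3f}")
```

Comparing the fitted exponent `b` across architectures at fixed data and compute is one concrete way the paper's question, whether inductive bias changes scaling, can be made quantitative.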
