Scaling Laws for a Multi-Agent Reinforcement Learning Model
Study of performance scaling for AlphaZero agents in the games Connect Four and Pentago. Finds scaling exponents similar to those of language models, and that SotA game-playing models are underparametrized for the compute available to them.
Suggests the optimal neural network size for a given compute budget in reinforcement learning, and highlights that SotA game-playing models underutilize the available compute.
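To make the scaling-law idea concrete, here is a minimal Python sketch (not from the paper's code) of fitting a saturating power-law curve to hypothetical (network size, Elo) measurements; the functional form, data points, and initial guesses are all illustrative assumptions:

```python
# Minimal sketch: fit a power-law scaling curve of the form
# Elo(N) = a - b * N**(-alpha), where N is the parameter count.
# The data below are hypothetical placeholders, not the paper's results.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, a, b, alpha):
    # Playing strength saturates toward `a` as network size grows.
    return a - b * n_params ** (-alpha)

# Hypothetical (parameter count, Elo) measurements for illustration.
n = np.array([1e3, 1e4, 1e5, 1e6, 1e7])
elo = np.array([800.0, 1200.0, 1500.0, 1700.0, 1820.0])

(a, b, alpha), _ = curve_fit(scaling_law, n, elo, p0=[2000.0, 5000.0, 0.3])
print(f"fitted exponent alpha = {alpha:.3f}")
```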
Calibrating Sequence Likelihood Improves Conditional Language Generation
Introduces sequence likelihood calibration (SLiC), which aligns the likelihood of model-generated sequences with their similarity to reference sequences, improving the quality of decoding candidates; the gains do not diminish with model scale.
Shows that SLiC matches or exceeds SotA results on generation tasks and offers a way to improve quality under limited training and inference budgets.
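A minimal PyTorch sketch of a rank-margin calibration loss in the spirit of SLiC follows: the candidate closer to the reference should receive a higher sequence log-likelihood than the less similar one. The function name, margin value, and toy log-likelihoods are assumptions, not the paper's exact formulation:

```python
import torch

def rank_margin_loss(logp_pos, logp_neg, margin=1.0):
    # logp_pos / logp_neg: sequence log-likelihoods of the candidate
    # closer to / further from the reference, respectively.
    # Penalize cases where the better candidate is not ahead by `margin`.
    return torch.clamp(margin - (logp_pos - logp_neg), min=0.0).mean()

# Toy usage with made-up log-likelihoods for two candidate pairs.
logp_pos = torch.tensor([-12.3, -8.1])
logp_neg = torch.tensor([-11.9, -15.4])
print(rank_margin_loss(logp_pos, logp_neg))
```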
Improving Sample Quality of Diffusion Models Using Self-Attention Guidance
Introduces self-attention guidance (SAG), a label-free method that uses a diffusion model's own self-attention maps to enhance the quality of generated images.
Proposes SAG as a general strategy for improving generated-image quality and demonstrates its efficacy across various diffusion models.
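As a rough illustration of how such guidance could be wired into a sampler, the PyTorch sketch below contrasts the model's noise prediction on the original input with its prediction on an input blurred only in high-attention regions, assuming a binary attention mask has already been extracted from the self-attention maps; all names, the blur parameters, and the guidance scale are assumptions rather than the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def gaussian_blur(x, sigma=3.0, ksize=9):
    # Depthwise Gaussian blur: one identical kernel per channel.
    half = ksize // 2
    coords = torch.arange(ksize, dtype=x.dtype, device=x.device) - half
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    kernel = torch.outer(g, g).expand(x.shape[1], 1, ksize, ksize).contiguous()
    return F.conv2d(x, kernel, padding=half, groups=x.shape[1])

def sag_noise(eps_model, x_t, t, attn_mask, scale=0.75):
    """Self-attention-guided noise estimate (illustrative).
    eps_model(x, t) -> predicted noise; attn_mask: binary map of
    high self-attention regions, broadcastable to x_t's shape."""
    eps = eps_model(x_t, t)
    # Degrade only the salient (high-attention) regions.
    x_hat = attn_mask * gaussian_blur(x_t) + (1 - attn_mask) * x_t
    eps_hat = eps_model(x_hat, t)
    # Extrapolate away from the prediction on the degraded input.
    return eps_hat + scale * (eps - eps_hat)

# Toy usage with a stand-in "model" that just echoes scaled input.
x = torch.randn(1, 3, 16, 16)
mask = (torch.rand(1, 1, 16, 16) > 0.5).float()
dummy_eps = lambda x, t: 0.1 * x
print(sag_noise(dummy_eps, x, t=torch.tensor([10]), attn_mask=mask).shape)
```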