Transformers are Sample Efficient World Models
IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer, achieves a mean human-normalized score of 1.046 and outperforms humans on 10 of the 26 games in the Atari 100k benchmark.
IRIS shows promise for improving sample efficiency in reinforcement learning and sets a new state of the art among methods without lookahead search.
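For readers unfamiliar with the metric, the sketch below shows how a human-normalized score is conventionally computed on Atari: the agent's raw score is rescaled so that a random policy maps to 0 and the human reference maps to 1. The function names and the per-game numbers are hypothetical and for illustration only; they are not taken from the IRIS paper.

```python
from statistics import mean


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Rescale a raw game score so that random play is 0 and human play is 1."""
    if human == random:
        raise ValueError("human and random reference scores must differ")
    return (agent - random) / (human - random)


def summarize(results: dict[str, tuple[float, float, float]]) -> tuple[float, int]:
    """Return (mean human-normalized score, number of games above human level).

    `results` maps a game name to (agent, random, human) raw scores.
    """
    normalized = [human_normalized_score(a, r, h) for a, r, h in results.values()]
    superhuman = sum(1 for s in normalized if s > 1.0)
    return mean(normalized), superhuman


# Hypothetical raw scores, NOT real benchmark numbers.
example_results = {
    "game_a": (15.0, 5.0, 25.0),
    "game_b": (30.0, 10.0, 20.0),
}
mean_score, games_above_human = summarize(example_results)
print(mean_score, games_above_human)
```

A mean above 1 does not imply the agent beats humans on most games: a few large per-game scores can pull the mean up, which is why the summary reports the mean and the count of superhuman games separately.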
Visual Prompting via Image Inpainting
Visual prompting, posed as simple image inpainting, is effective when the inpainting algorithm is trained on the right data. The approach was demonstrated on a range of image-to-image tasks, including foreground segmentation, single object detection, colorization, and edge detection.
Visual prompting offers a promising way to adapt pre-trained visual models to novel downstream tasks without task-specific finetuning or model modification.
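One way to picture visual prompting as inpainting is to place an example input, its example output, and a new query image into a single grid and leave one cell blank for the model to fill in. The sketch below builds such a canvas and the matching hole mask with NumPy. The 2x2 layout, the function names, and the assumption that all images share one shape are illustrative choices, not a description of the authors' implementation, and the inpainting model itself is left out.

```python
import numpy as np


def build_visual_prompt(example_input, example_output, query_input):
    """Assemble a 2x2 grid: example pair on top, query bottom-left, hole bottom-right.

    All three images are assumed to be arrays of identical shape (H, W, C).
    Returns the canvas and a boolean mask that is True where inpainting is needed.
    """
    if not (example_input.shape == example_output.shape == query_input.shape):
        raise ValueError("all images must share the same (H, W, C) shape")
    h, w, c = query_input.shape
    canvas = np.zeros((2 * h, 2 * w, c), dtype=query_input.dtype)
    canvas[:h, :w] = example_input    # top-left: task example, input
    canvas[:h, w:] = example_output   # top-right: task example, output
    canvas[h:, :w] = query_input      # bottom-left: new query
    mask = np.zeros((2 * h, 2 * w), dtype=bool)
    mask[h:, w:] = True               # bottom-right: cell to be inpainted
    return canvas, mask


def read_prediction(completed_canvas):
    """Crop the bottom-right cell, where the inpainted answer would appear."""
    h, w = completed_canvas.shape[0] // 2, completed_canvas.shape[1] // 2
    return completed_canvas[h:, w:]


# Placeholder images; a real prompt would use actual task examples.
rng = np.random.default_rng(0)
ex_in, ex_out, query = (rng.random((8, 8, 3)) for _ in range(3))
canvas, mask = build_visual_prompt(ex_in, ex_out, query)
print(canvas.shape, int(mask.sum()))
```

Under this framing the task is specified entirely by the example pair in the top row, so switching from, say, segmentation to colorization only changes the images placed in the grid, not the model.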