Wed Nov 23 2022

TorchScale: Transformers at Scale

PyTorch
Deep learning
Language modeling
Neural machine translation
Vision pretraining

Presents an open-source toolkit for scaling up Transformers, allowing for improved modeling generality and capability as well as training stability and efficiency. Demonstrates successful scaling to different model sizes in language modeling and neural machine translation.

Enables efficient and effective scaling of Transformers for language modeling and neural machine translation, improving modeling generality and capability as well as training stability and efficiency.
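
For orientation, here is a minimal usage sketch in the style of the open-source TorchScale repository (microsoft/torchscale); the EncoderConfig fields follow its README example and should be read as an assumption, not a full API reference.

from torchscale.architecture.config import EncoderConfig
from torchscale.architecture.encoder import Encoder

# Build a Transformer encoder; the toolkit's defaults bundle the
# stability- and efficiency-oriented modeling choices described above.
config = EncoderConfig(vocab_size=64000)
model = Encoder(config)
print(model)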

Self-Supervised Learning based on Heat Equation

Self-supervised learning
Computer vision
Image classification
Object detection

Proposes QB-Heat, a self-supervised learning method based on extending the heat equation into a high-dimensional feature space. QB-Heat enables simple masked image modeling for CNNs and works well for pre-training light-weight networks suitable for image classification and object detection.

Introduces QB-Heat, a simple self-supervised learning method that enables masked image modeling for CNNs and works well for pre-training light-weight networks for image classification and object detection.
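
As an illustration of the recipe (not the authors' code): a light-weight CNN encodes only a visible quarter block of the image, a linear map extends the features toward a masked region in the spirit of the heat equation's smoothness assumption, and a small decoder predicts the hidden pixels. All module sizes below are arbitrary.

import torch
import torch.nn as nn

class MaskedCNNPretrainer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # Light-weight CNN encoder over the visible quarter block.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.extrapolate = nn.Conv2d(dim, dim, 1)      # linear feature extension
        self.decoder = nn.ConvTranspose2d(dim, 3, 4, stride=4)

    def forward(self, images):
        b, c, h, w = images.shape
        visible = images[:, :, : h // 2, : w // 2]     # keep one quarter block
        feats = self.extrapolate(self.encoder(visible))
        recon = self.decoder(feats)                    # predict a masked region
        target = images[:, :, : h // 2, w // 2 :]      # the adjacent quarter
        return nn.functional.mse_loss(recon, target)

model = MaskedCNNPretrainer()
loss = model(torch.randn(2, 3, 64, 64))
loss.backward()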

Retrieval-Augmented Multimodal Language Modeling

CLIP model
Transformer architecture
Multimodal learning
Natural language processing
Image generation
Caption generation

Introduces RA-CM3, a retrieval-augmented multimodal model that enables a base multimodal model to refer to relevant knowledge fetched by a retriever from external memory. RA-CM3 significantly outperforms baseline models on image and caption generation tasks while requiring less compute for training.

Presents RA-CM3, a retrieval-augmented multimodal model that outperforms baseline models on image and caption generation tasks while requiring less compute for training.
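
A hedged sketch of the retrieval-augmented pattern described above: a frozen dual encoder scores an external memory, and the top-k hits are prepended to the generator's input. The memory, embeddings, and generator below are toy stand-ins, not RA-CM3's actual components.

import torch

def retrieve_and_generate(query_tokens, query_emb, memory_embs, memory_tokens,
                          generator, k=2):
    # Dense retrieval: cosine similarity against the external memory.
    sims = torch.nn.functional.cosine_similarity(
        query_emb.unsqueeze(0), memory_embs, dim=-1)
    topk = sims.topk(k).indices
    # In-context augmentation: retrieved documents go in front of the query,
    # so the base model can attend to the fetched knowledge while generating.
    context = torch.cat([memory_tokens[i] for i in topk] + [query_tokens])
    return generator(context.unsqueeze(0))

# Toy usage with random stand-ins for the memory, embeddings, and model.
memory_embs = torch.randn(100, 512)
memory_tokens = [torch.randint(0, 1000, (16,)) for _ in range(100)]
generator = torch.nn.Embedding(1000, 32)    # stand-in for a multimodal LM
out = retrieve_and_generate(torch.randint(0, 1000, (8,)), torch.randn(512),
                            memory_embs, memory_tokens, generator)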

Masked Autoencoding for Scalable and Generalizable Decision Making

Sequential data
Machine learning
Artificial intelligence
Reinforcement learning
Behavioral cloning
Zero-shot transfer

Presents MaskDP, a masked-autoencoding self-supervised pretraining method for reinforcement learning that outperforms GPT-like autoregressive approaches.

Provides a scalable and generalizable decision-making process through self-supervised pretraining, with zero-shot transfer capability to new tasks and promising scaling behavior in offline RL.
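
An illustrative sketch of masked trajectory autoencoding as described above: random state and action tokens in a trajectory are replaced by a mask token and a bidirectional Transformer reconstructs them. Dimensions, masking ratio, and architecture are arbitrary stand-ins, not the paper's settings.

import torch
import torch.nn as nn

class MaskedDecisionAutoencoder(nn.Module):
    def __init__(self, obs_dim=17, act_dim=6, dim=64):
        super().__init__()
        self.embed_obs = nn.Linear(obs_dim, dim)
        self.embed_act = nn.Linear(act_dim, dim)
        self.mask_token = nn.Parameter(torch.zeros(dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head_obs = nn.Linear(dim, obs_dim)
        self.head_act = nn.Linear(dim, act_dim)

    def forward(self, obs, act, mask_ratio=0.5):
        # Interleave state and action tokens: s_0, a_0, s_1, a_1, ...
        tokens = torch.stack([self.embed_obs(obs), self.embed_act(act)],
                             dim=2).flatten(1, 2)
        b, t, _ = tokens.shape
        masked = torch.rand(b, t, device=tokens.device) < mask_ratio
        tokens = torch.where(masked.unsqueeze(-1), self.mask_token, tokens)
        h = self.backbone(tokens)                    # bidirectional attention
        # Reconstruct states and actions from their (possibly masked) slots.
        loss = (nn.functional.mse_loss(self.head_obs(h[:, 0::2]), obs)
                + nn.functional.mse_loss(self.head_act(h[:, 1::2]), act))
        return loss

model = MaskedDecisionAutoencoder()
loss = model(torch.randn(4, 10, 17), torch.randn(4, 10, 6))
loss.backward()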

Inversion-Based Creativity Transfer with Diffusion Models

Probabilistic models
Computer vision
Artificial intelligence
Artistic style transfer
Image synthesis
Painting analysis

Learns artistic creativity directly from a single painting and then guides synthesis without requiring complex textual descriptions.

Enables transfer of artistic style from a single painting to guide synthesis without complex textual descriptions, improving on arbitrary example-guided artistic image generation methods.
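
A conceptual sketch of the inversion step: optimize a learnable embedding so that a frozen diffusion denoiser, conditioned on it, reconstructs the single reference painting; at sampling time the embedding stands in for a textual description. The denoiser and noising schedule below are toy stand-ins, not the paper's model.

import torch

def invert_style(painting, denoiser, dim=768, steps=200, lr=1e-2):
    style_emb = torch.zeros(1, dim, requires_grad=True)   # learned style token
    opt = torch.optim.Adam([style_emb], lr=lr)
    for _ in range(steps):
        t = torch.randint(0, 1000, (1,))                  # random timestep
        noise = torch.randn_like(painting)
        noisy = painting + noise * (t.float() / 1000)     # toy noising schedule
        pred = denoiser(noisy, t, style_emb)              # frozen denoiser
        loss = torch.nn.functional.mse_loss(pred, noise)  # noise-prediction loss
        opt.zero_grad(); loss.backward(); opt.step()
    return style_emb.detach()   # condition new synthesis on this embedding

# Toy usage with a stand-in denoiser that ignores the timestep.
denoiser = lambda x, t, c: 0.1 * x + c.mean()
emb = invert_style(torch.randn(1, 3, 64, 64), denoiser)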
