Thu Mar 02 2023

Dropout Reduces Underfitting

Neural networks
Deep learning, Regularization
Improving generalization accuracy in neural network training

Models equipped with early dropout achieve lower final training loss compared to their counterparts without dropout

Applying dropout only during the early phase of training and then disabling it mitigates underfitting, improving the performance of models that would otherwise underfit
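The early-dropout schedule can be sketched in a few lines: dropout is active only for the first part of training and switched off afterwards. The cutoff step and rate below are illustrative placeholders, not the paper's tuned hyperparameters.

```python
import random

def early_dropout_rate(step, cutoff_step, p=0.1):
    """Early dropout: return the dropout rate p during the first
    `cutoff_step` training steps, and 0.0 (dropout disabled) after.
    (Illustrative schedule; the paper tunes the cutoff per model.)"""
    return p if step < cutoff_step else 0.0

def apply_dropout(activations, p, rng):
    """Standard inverted dropout on a list of activation values:
    zero each unit with probability p, scale survivors by 1/(1-p)."""
    if p == 0.0:
        return list(activations)
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]
```

In a training loop one would call `early_dropout_rate(step, cutoff_step)` each step and feed the result to `apply_dropout`, so regularization noise is present only while the model is still underfitting.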

Consistency Models

Diffusion models
Generative models, Image generation
Fast one-step generation, Zero-shot data editing

Proposes consistency models, a new family of generative models that achieve high sample quality without adversarial training

Consistency models can be trained either by distilling pre-trained diffusion models or as standalone generative models, achieving high sample quality with fast one-step generation
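The key object is a consistency function f(x, t) that maps any noisy input directly to the data endpoint, with the boundary condition f(x, ε) = x. The paper enforces this via a skip-connection parameterization, f(x, t) = c_skip(t)·x + c_out(t)·F(x, t); the sketch below implements those coefficients with the paper's stated constants (σ_data = 0.5, ε = 0.002), treating the neural network F as an arbitrary callable.

```python
import math

SIGMA_DATA = 0.5   # data standard deviation used in the paper's parameterization
EPS = 0.002        # minimum noise level epsilon (boundary of the time interval)

def c_skip(t):
    """Skip-connection weight; equals 1 at t = EPS."""
    return SIGMA_DATA**2 / ((t - EPS)**2 + SIGMA_DATA**2)

def c_out(t):
    """Network-output weight; equals 0 at t = EPS."""
    return SIGMA_DATA * (t - EPS) / math.sqrt(SIGMA_DATA**2 + t**2)

def consistency_fn(x, t, network):
    """f(x, t) = c_skip(t) * x + c_out(t) * F(x, t).
    At t = EPS this reduces to the identity, so the boundary
    condition f(x, EPS) = x holds for any network F."""
    return c_skip(t) * x + c_out(t) * network(x, t)
```

One-step generation then amounts to a single call f(x_T, T) on pure noise; the scalar `x` here stands in for an image tensor in the real model.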

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Speech recognition
Automatic speech recognition, Multilingual models
Multilingual ASR, Speech-to-text translation

Pre-trains a single model on a large unlabeled multilingual dataset of 12M hours spanning over 300 languages, then fine-tunes it on a smaller labeled dataset

The Universal Speech Model (USM) can perform automatic speech recognition (ASR) across 100+ languages, achieving state-of-the-art performance on downstream multilingual ASR and speech-to-text translation tasks

Human Motion Diffusion as a Generative Prior

Computer graphics
Motion generation
Human motion generation for gaming and animation
Long sequence generation
Few-shot and zero-shot settings

This paper shows that the data-scarcity gap in human motion generation can be mitigated by using a pre-trained diffusion model as a generative prior. The authors demonstrate that the prior is effective for fine-tuning in few-shot and even zero-shot settings. They also introduce DoubleTake, an inference-time method that produces animations up to ten minutes long, composed of prompted intervals with meaningful, controlled transitions between them.

This paper provides AI-based solutions for generating long, complex human motions from small datasets, including in few-shot and zero-shot settings. The proposed method can help businesses that rely on motion generation, such as gaming or animation companies, improve their workflows and produce high-quality motions with limited data and resources.

Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control

Artificial intelligence
Machine learning
Robotics
Natural language processing
Robotic control
Language-conditioned robotic policies
Embodied agents

This paper proposes a guided decoding strategy that constructs an action sequence that is both likely under the language model and realizable according to grounded models of the environment. The authors demonstrate that this strategy can solve complex, long-horizon embodied tasks in a robotic setting by leveraging the knowledge of both models.

This paper provides AI-based solutions for improving robotic performance by combining language models with grounded models of the environment. The proposed method can help businesses that rely on robotics, such as manufacturing or logistics companies, to improve their operational efficiency and automate complicated tasks in a real-world setting.
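The core decoding idea, combining a language model's token distribution with a grounding function's feasibility scores, can be sketched as a product of probabilities followed by renormalization. The toy dictionaries below stand in for real language and grounded models; the token names are hypothetical.

```python
def grounded_decode_step(lm_probs, grounded_probs):
    """One step of grounded decoding: weight each candidate token's
    language-model probability by a grounding score reflecting whether
    the action is realizable in the environment, renormalize, and pick
    the highest-scoring token. Returns (chosen_token, distribution)."""
    combined = {tok: lm_probs[tok] * grounded_probs.get(tok, 0.0)
                for tok in lm_probs}
    total = sum(combined.values())
    if total == 0:
        raise ValueError("no candidate token is both likely and feasible")
    return max(combined, key=combined.get), {t: v / total
                                             for t, v in combined.items()}
```

For example, if the language model prefers "cup" but the grounded model reports the cup is out of reach, the combined score can shift the choice to a feasible alternative such as "sponge"; the real system applies this token by token over full action sequences.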
