Sun Jun 19 2022

MineDojo

Reinforcement learning
AI for gaming
Embodied agents

Introduces MineDojo, a new framework built on the popular Minecraft game that features a simulation suite with thousands of diverse open-ended tasks and an internet-scale knowledge base.

MineDojo is an open-source framework that provides a simulation suite and knowledge base for building generalist agents that can solve a variety of open-ended tasks. It promotes research towards the goal of generally capable embodied agents.
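
To make the simulation side concrete, here is a minimal interaction loop in MineDojo's gym-style Python API, a sketch based on the project's published examples (the task id and observation settings are illustrative and may vary across releases):

```python
import minedojo

# Create one of MineDojo's thousands of task environments.
# Task id and image size follow the project's published examples.
env = minedojo.make(
    task_id="harvest_wool_with_shears_and_sheep",
    image_size=(160, 256),
)

obs = env.reset()
for _ in range(100):
    # MineDojo exposes a multi-discrete action space; start from a
    # no-op action and set the component for forward movement.
    action = env.action_space.no_op()
    action[0] = 1  # move forward
    obs, reward, done, info = env.step(action)
    if done:
        break
env.close()
```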

Unified-IO

Transformer-based architecture
Computer vision
Natural language processing
Vision and language

Performs a large variety of AI tasks, spanning from classical CV and VL tasks to NLP tasks such as QA and paraphrasing, by casting every input and output into a sequence of discrete vocabulary tokens.

Unified-IO is a single unified model that can perform a large variety of AI tasks without task-specific fine-tuning. It achieves this unification by homogenizing every supported input and output into a sequence of discrete vocabulary tokens. Code and demos for Unified-IO are available for researchers to use.
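
The unification idea can be illustrated with a toy sketch: text is mapped into a subword vocabulary and images are quantized into a disjoint range of discrete codes, so every task reduces to sequence-to-sequence prediction over one shared vocabulary. The tokenizer and codebook below are hypothetical stand-ins, not Unified-IO's actual components:

```python
import numpy as np

VOCAB_SIZE_TEXT = 32000   # hypothetical subword vocabulary size
NUM_IMAGE_CODES = 1024    # hypothetical visual codebook size

def text_to_tokens(text: str) -> list[int]:
    # Stand-in for a real subword tokenizer (e.g., SentencePiece):
    # hash each word into the text vocabulary range.
    return [hash(w) % VOCAB_SIZE_TEXT for w in text.split()]

def image_to_tokens(image: np.ndarray) -> list[int]:
    # Stand-in for a learned VQ codebook: quantize the mean intensity
    # of each 8x8 block (toy reshape, not true spatial patching) into
    # a discrete code, offset past the text vocabulary so ids from the
    # two modalities never collide.
    blocks = image.reshape(-1, 8, 8)
    codes = (blocks.mean(axis=(1, 2)) / 256 * NUM_IMAGE_CODES).astype(int)
    return [VOCAB_SIZE_TEXT + int(c) for c in codes]

# Every task becomes plain sequence prediction over one shared
# discrete vocabulary, regardless of the input modality.
prompt = text_to_tokens("What object is in the image?")
image = np.zeros((64, 64), dtype=np.uint8)
sequence = prompt + image_to_tokens(image)
print(len(sequence), "input tokens")
```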

Evolution through Large Models

Language models
Deep learning
Genetic programming

Pursues the insight that LLMs trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming.

Evolution through Large Models (ELM) explores the implications of large language models (LLMs) trained to generate code for genetic programming. Because an LLM can approximate the changes a human programmer would plausibly make, it can serve as an intelligent mutation operator and help bootstrap new models that output appropriate artifacts for a given domain even when no training data exists for it. This carries implications for open-endedness, deep learning, and reinforcement learning.
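
The core loop can be sketched as an evolutionary algorithm whose mutation step is delegated to a code LLM. The `llm_mutate` and `fitness` functions below are hypothetical stand-ins (a real system would prompt a code-generation model and evaluate programs in a target domain), and the population handling is far simpler than ELM's MAP-Elites machinery:

```python
import random

def llm_mutate(program: str) -> str:
    # Hypothetical stand-in for prompting a code LLM with an
    # instruction like "improve this function"; a real system would
    # call a model API here and return the generated revision.
    return program + "  # llm-suggested tweak"

def fitness(program: str) -> float:
    # Toy fitness: prefer shorter programs. Real ELM evaluates the
    # program's behavior in a target domain instead.
    return -len(program)

population = ["def solve(x):\n    return x"] * 8
for generation in range(10):
    # The LLM acts as the mutation operator in place of random edits.
    children = [llm_mutate(random.choice(population)) for _ in range(8)]
    population = sorted(population + children, key=fitness, reverse=True)[:8]

print(population[0])
```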

Bootstrapped Transformer for Offline Reinforcement Learning

Offline reinforcement learning
Machine learning
Reinforcement learning
Sequence generation

Proposes a novel algorithm named Bootstrapped Transformer, which leverages the learned model to self-generate more offline data and thereby further boost sequence-model training. It significantly outperforms Trajectory Transformer on the D4RL benchmark.

Can improve business operations and workflows that involve reinforcement learning by addressing the limited-dataset problem and improving the performance of sequence generation models.
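
The bootstrapping idea amounts to a train/generate/augment loop. The skeleton below uses hypothetical `train_on`, `generate_trajectories`, and `confidence` methods to illustrate the flow; it is a sketch of the idea, not the paper's exact procedure:

```python
def bootstrap_training(model, offline_dataset, rounds=3,
                       num_generated=1000, keep_top=0.1):
    """Alternate between fitting the sequence model and letting it
    self-generate extra trajectories to augment a limited dataset."""
    data = list(offline_dataset)
    for _ in range(rounds):
        model.train_on(data)                        # hypothetical API
        candidates = model.generate_trajectories(   # hypothetical API
            num_generated
        )
        # Keep only the generated trajectories the model itself scores
        # as most likely, so low-quality rollouts don't pollute training.
        candidates.sort(key=model.confidence, reverse=True)
        data += candidates[: int(len(candidates) * keep_top)]
    return model
```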

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning

Two-tower architecture
Machine learning
Vision-language representation learning
Natural language processing
Computer vision

Introduces multiple bridge layers that connect the top layers of the uni-modal encoders to each layer of the cross-modal encoder. Achieves state-of-the-art performance on various downstream vision-language tasks after being pre-trained on only 4M images.

Can help businesses improve their natural language processing and computer vision operations through better vision-language representation learning.
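
Schematically, each bridge layer fuses the hidden states of a top uni-modal encoder layer into one layer of the cross-modal encoder, so fusion happens at multiple depths rather than only once at the bottom of the cross-modal stack. The PyTorch sketch below uses LayerNorm over the sum of the two states as one plausible fusion choice; it is an illustration, not the paper's verified implementation:

```python
import torch
import torch.nn as nn

class BridgeLayer(nn.Module):
    """Connects one top uni-modal layer to one cross-modal layer by
    fusing their hidden states before cross-modal attention."""
    def __init__(self, hidden: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden)

    def forward(self, cross_state: torch.Tensor,
                unimodal_state: torch.Tensor) -> torch.Tensor:
        # Element-wise sum then LayerNorm: the cross-modal stream sees
        # the uni-modal representation at this fusion depth, not only
        # at the bottom of the cross-modal encoder.
        return self.norm(cross_state + unimodal_state)

# Toy usage: fuse top text-encoder states into a cross-modal layer.
bridge = BridgeLayer(hidden=768)
cross = torch.randn(1, 16, 768)   # cross-modal hidden states
uni = torch.randn(1, 16, 768)     # top uni-modal encoder states
fused = bridge(cross, uni)
print(fused.shape)                # torch.Size([1, 16, 768])
```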
