Thu Sep 22 2022
Sun Sep 18 2022

Human-level Atari 200x faster

Gaming
Artificial Intelligence
Reinforcement Learning
Reinforcement learning in business operations

Agent57 was the first agent to surpass the human benchmark on all 57 Atari games, but it did so with poor data efficiency, requiring nearly 80 billion frames of experience. Taking Agent57 as a starting point, the research employs a diverse set of strategies to achieve a 200-fold reduction in the experience needed to outperform the human baseline.

This research proposes a more efficient method to build general agents that can perform well over a wide range of tasks. The proposed method requires 200 times less experience to outperform the human baseline, making it more data-efficient and robust. This can be valuable for businesses that use reinforcement learning to optimize their processes and workflows.

Text and Patterns: For Effective Chain of Thought It Takes Two to Tango

Few-Shot Techniques
Artificial Intelligence
Natural Language Processing
Natural Language Processing in business operations

This work uses counterfactual prompting to develop a deeper understanding of chain-of-thought (CoT) few-shot prompting mechanisms in large language models. Its empirical and qualitative analysis reveals that a symbiotic relationship between text and patterns explains the success of few-shot prompting: text helps extract commonsense from the question to help patterns, and patterns enforce task understanding and direct text generation.

The research identifies the key components of a CoT prompt and devises a method called CCOT (Concise CoT), which delivers similar or slightly higher task solve rates. This can be useful for businesses that rely on natural language processing and few-shot techniques to augment their communication and decision-making processes.
