Mega: Moving Average Equipped Gated Attention
Introduces Mega, a simple, theoretically grounded, single-head gated attention mechanism equipped with an (exponential) moving average that injects the inductive bias of position-aware local dependencies into the otherwise position-agnostic attention mechanism. Achieves significant improvements over other sequence models across a wide range of sequence modeling benchmarks.
Implement Mega in your sequence modeling tasks to improve performance by incorporating the inductive bias of position-aware local dependencies.
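The core mechanism can be sketched as follows: an exponential moving average smooths the input along the time axis before a single-head attention step, and a learned sigmoid gate mixes the attention output back with the raw input. This is a minimal NumPy sketch under assumed shapes, with randomly initialized projections (the names `Wq`, `Wk`, `Wv`, `Wg` and the scalar EMA are illustrative assumptions); the paper's full architecture uses a multi-dimensional damped EMA and additional components.

```python
import numpy as np

def ema_smooth(x, alpha=0.5):
    # Exponential moving average over the time axis:
    # y_t = alpha * x_t + (1 - alpha) * y_{t-1}
    # This is the position-aware local bias injected before attention.
    y = np.zeros_like(x)
    prev = np.zeros(x.shape[-1])
    for t in range(x.shape[0]):
        prev = alpha * x[t] + (1 - alpha) * prev
        y[t] = prev
    return y

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mega_like_attention(x, Wq, Wk, Wv, Wg, alpha=0.5):
    # x: (seq_len, dim). Queries/keys come from the EMA-smoothed input,
    # so the attention scores see positional locality.
    xe = ema_smooth(x, alpha)
    q, k, v = xe @ Wq, xe @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    out = attn @ v
    # Single-head sigmoid gate mixes attention output with the raw input.
    gate = 1.0 / (1.0 + np.exp(-(xe @ Wg)))
    return gate * out + (1.0 - gate) * x
```

With `alpha=1.0` the EMA reduces to the identity, so the smoothing strength interpolates between purely local averaging and no positional bias at all.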
A Generalist Neural Algorithmic Learner
Constructs a generalist neural algorithmic learner - a single graph neural network processor capable of learning to execute a wide range of algorithms, such as sorting, searching, dynamic programming, path-finding, and geometry. Improves average single-task performance by more than 20% over prior art.
Implement a generalist neural algorithmic learner to execute a wide range of algorithms with a single shared processor and improve performance on algorithmic tasks.
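The idea that one message-passing processor can execute classical algorithms can be illustrated with a hand-specified instance: a single shared update rule that, given additive messages and min-aggregation, performs the relaxation step of Bellman-Ford shortest paths (one of the path-finding tasks such learners are trained on). This is a didactic pure-Python sketch of the algorithmic alignment, not the learned GNN itself; the names `mpnn_step` and `bellman_ford` are hypothetical.

```python
def mpnn_step(h, adj, msg_fn, agg=min):
    # One synchronous round of message passing with a shared processor:
    # each node aggregates messages from its in-neighbors. With agg=min
    # and additive messages this matches shortest-path relaxation.
    # adj[v] is a list of (u, weight) incoming edges for node v.
    new_h = list(h)
    for v in range(len(h)):
        msgs = [msg_fn(h[u], w) for u, w in adj[v]]
        if msgs:
            new_h[v] = min(new_h[v], agg(msgs))
    return new_h

def bellman_ford(adj, source, n, steps):
    # Repeatedly applying the same processor step executes the algorithm.
    h = [float('inf')] * n
    h[source] = 0.0
    for _ in range(steps):
        h = mpnn_step(h, adj, lambda hu, w: hu + w)
    return h
```

In the learned setting, the message and aggregation functions above are replaced by neural networks trained with intermediate algorithm states as supervision, and the same processor weights are shared across all tasks.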