Mon Dec 05 2022 - Top Trending AI Papers

Tue Dec 06 2022

Mon Dec 05 2022

Meta-Learning Fast Weight Language Models

NLP

Neural Networks

Language Models

Language modeling

Presents Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as linear attention.

FWLs can easily be added on top of existing transformer models, require relatively little extra compute or memory to run, and significantly improve language modeling perplexity.

https://arxiv.org/pdf/2212.02475.pdf

https://arxiv.org/abs/2212.02475

https://twitter.com/arankomatsuzaki/status/1599946023836651522/photo/1

Leveraging Large Language Models for Multiple Choice Question Answering

NLP

Neural Networks

Language Models

Natural language processing

Question answering

Finds that code/text-davinci performs much better on MCQ if the candidate answers are characters like 'A', 'B', etc unlike the original GPT3.

The natural approach allows the model to explicitly compare answer options, reduces computational costs, and mitigates the effects of tokenization scheme and answer option representations on answer selection.

https://arxiv.org/pdf/2210.12353.pdf

https://arxiv.org/abs/2210.12353

https://twitter.com/arankomatsuzaki/status/1599797746583699459/photo/1