Tue Dec 06 2022
Mon Dec 05 2022

Meta-Learning Fast Weight Language Models

NLP
Neural Networks
Language Models
Language modeling

Presents Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as linear attention.

FWLs can easily be added on top of existing transformer models, require relatively little extra compute or memory to run, and significantly improve language modeling perplexity.
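
The phrase "expressing gradient updates as linear attention" rests on a general identity for linear layers, which the sketch below checks numerically. It is an illustration only, not the paper's FWL architecture: the shapes, the learning rate `lr`, and the random stand-in gradients `gs` are all assumptions made for the demo. For a layer y = W x, one gradient step changes W by a sum of outer products, so applying the updated weights to a new input equals the original output plus an unnormalized attention term over the past inputs.

```python
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

random.seed(0)
d_in, d_out, n_tokens, lr = 4, 3, 5, 0.1  # illustrative sizes, not from the paper

def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

W = [rand_vec(d_in) for _ in range(d_out)]    # slow weights of a linear layer
xs = [rand_vec(d_in) for _ in range(n_tokens)]   # past layer inputs (act as keys)
gs = [rand_vec(d_out) for _ in range(n_tokens)]  # stand-in gradients dL/dy (act as values)
query = rand_vec(d_in)                           # new input to the layer

# Route 1: take an explicit gradient step, W_fast = W - lr * sum_i outer(g_i, x_i),
# then apply the updated weights to the query.
W_fast = [
    [W[r][c] - lr * sum(g[r] * x[c] for g, x in zip(gs, xs)) for c in range(d_in)]
    for r in range(d_out)
]
out_update = matvec(W_fast, query)

# Route 2: never materialize W_fast. Score the query against each past input
# (dot product, no softmax) and mix the gradients with those scores.
base = matvec(W, query)
scores = [dot(x, query) for x in xs]
out_attn = [
    base[r] - lr * sum(s * g[r] for s, g in zip(scores, gs)) for r in range(d_out)
]

max_diff = max(abs(a - b) for a, b in zip(out_update, out_attn))
print(f"max difference between the two routes: {max_diff:.2e}")
```

Both routes give the same output up to floating-point rounding. The second one is the reason the update can run inside an attention-style layer rather than as a separate optimization step.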

Leveraging Large Language Models for Multiple Choice Question Answering

NLP
Neural Networks
Language Models
Natural language processing
Question answering

Finds that code-davinci and text-davinci perform much better on multiple-choice questions (MCQ) when the candidate answers are labeled with symbols such as 'A' and 'B', unlike the original GPT-3.

This natural approach, in which the question and all labeled answer options appear in a single prompt, allows the model to explicitly compare answer options, reduces computational costs, and mitigates the effects of tokenization scheme and answer option representations on answer selection.
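
The difference between the two prompting styles can be sketched as follows. This is a minimal illustration, not the paper's code: the prompt templates, the helper names `build_mcp_prompt`, `build_cloze_prompts`, and `pick_answer`, and the example scores are all hypothetical, and no model is called.

```python
from string import ascii_uppercase

def build_mcp_prompt(question, options):
    """One prompt holding the question and every labeled option.

    The model is expected to continue with a single label ('A', 'B', ...),
    so one forward pass covers all options.
    """
    lines = [f"Question: {question}"]
    for label, option in zip(ascii_uppercase, options):
        lines.append(f"{label}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)

def build_cloze_prompts(question, options):
    """One prompt per option, each scored separately by the model.

    The model never sees the options side by side, and scores depend on how
    each option's text happens to be tokenized.
    """
    return [f"Question: {question}\nAnswer: {option}" for option in options]

def pick_answer(label_scores):
    """Return the label with the highest score (e.g. a log-probability)."""
    return max(label_scores, key=label_scores.get)

question = "Which planet is closest to the Sun?"
options = ["Venus", "Mercury", "Mars"]

mcp_prompt = build_mcp_prompt(question, options)
cloze_prompts = build_cloze_prompts(question, options)

# Hypothetical log-probabilities a model might assign to each label token.
example_scores = {"A": -2.1, "B": -0.3, "C": -3.0}
chosen = pick_answer(example_scores)

print(mcp_prompt)
print(f"cloze prompts needed: {len(cloze_prompts)}, chosen label: {chosen}")
```

With the labeled prompt, the answer is read from the scores of a few single-token labels, which is where the lower compute cost and the reduced sensitivity to tokenization come from.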

Sun Dec 04 2022
Tue Nov 29 2022
Wed Nov 23 2022
Tue Nov 22 2022