Tue Dec 06 2022
Mon Dec 05 2022
Meta-Learning Fast Weight Language Models
NLP
Neural Networks
Language Models
Language modeling
Presents Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as linear attention.
FWLs can easily be added on top of existing transformer models, require relatively little extra compute or memory to run, and significantly improve language modeling perplexity.
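To make "gradient updates as linear attention" concrete, here is a minimal sketch of the underlying identity for a single linear layer. It is not the paper's FWL implementation. The helper names, the plain-list representation, and the step size `lr` are assumptions chosen for illustration. For y = W x, one gradient step subtracts lr * sum_i outer(g_i, k_i) from W, so applying the updated layer to a new input gives the same result as adding an unnormalized (softmax-free) attention term with keys k_i, values -lr * g_i, and the new input as the query.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gradient_step_output(W, keys, grads, lr, query):
    """Apply an explicit gradient step to W, then run the updated layer."""
    W_fast = [row[:] for row in W]
    for g, k in zip(grads, keys):
        for r in range(len(W_fast)):
            for c in range(len(W_fast[0])):
                # dLoss/dW for y = W x is outer(g, k), summed over past tokens
                W_fast[r][c] -= lr * g[r] * k[c]
    return matvec(W_fast, query)


def linear_attention_output(W, keys, grads, lr, query):
    """Same result without ever materializing the updated weights."""
    out = matvec(W, query)
    for g, k in zip(grads, keys):
        score = dot(k, query)  # unnormalized attention score, no softmax
        for r in range(len(out)):
            out[r] -= lr * score * g[r]  # value vector is -lr * g
    return out


# Toy values (hypothetical): 2 past tokens, a 2x2 layer.
W = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 2.0], [0.5, -1.0]]     # layer inputs at earlier tokens
grads = [[0.2, -0.1], [0.3, 0.4]]    # stand-ins for dLoss/dy at those tokens
query = [2.0, 1.0]                   # layer input at the current token

a = gradient_step_output(W, keys, grads, 0.5, query)
b = linear_attention_output(W, keys, grads, 0.5, query)
assert all(abs(x - y) < 1e-9 for x, y in zip(a, b))
```

The second route never builds the updated weight matrix, which hints at why an attention-style formulation can be cheaper than dynamic evaluation's per-step weight updates.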
Leveraging Large Language Models for Multiple Choice Question Answering
NLP
Neural Networks
Language Models
Natural language processing
Question answering
Finds that code-davinci and text-davinci, unlike the original GPT-3, perform much better on multiple-choice questions (MCQ) when the candidate answers are labeled with characters like 'A', 'B', etc. and the model answers with one of those characters.
This natural approach lets the model compare answer options explicitly, reduces computational cost, and mitigates the effects of the tokenization scheme and of answer-option representations on answer selection.
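The prompting style described above can be sketched as follows. The prompt template, the function names `build_mcq_prompt` and `pick_answer`, and the example scores are all hypothetical. The paper's exact format is not given here, and no real model API is called. The idea is that every option appears in a single prompt, and the answer is read off from the model's scores for the single-character labels.

```python
import string


def build_mcq_prompt(question, options):
    """Lay out a question and its options with letter labels A, B, C, ..."""
    labels = list(string.ascii_uppercase[:len(options)])
    lines = [f"Question: {question}"]
    lines += [f"{label}. {option}" for label, option in zip(labels, options)]
    lines.append("Answer:")
    return "\n".join(lines), labels


def pick_answer(label_scores):
    """Return the label with the highest score.

    label_scores is an assumed mapping from label to the model's
    log-probability of emitting that label as the next token.
    """
    return max(label_scores, key=label_scores.get)


prompt, labels = build_mcq_prompt("What is 2 + 2?", ["3", "4", "5"])
print(prompt)

# Made-up scores standing in for a model's next-token log-probabilities.
choice = pick_answer({"A": -3.2, "B": -0.1, "C": -2.5})
```

Because each label is a single short symbol, the comparison does not depend on how long the option texts are or how they are tokenized, which is consistent with the mitigation the summary mentions.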
Sun Dec 04 2022
Tue Nov 29 2022
Wed Nov 23 2022
Tue Nov 22 2022