Mon Mar 20 2023
Sun Mar 19 2023

CoLT5: Faster Long-Range Transformers with Conditional Computation

Transformers
Natural Language Processing
Conditional Computation
NLP document processing
Long-input SCROLLS benchmark

Proposes CoLT5, a long-input Transformer model that uses conditional computation to devote more resources to important tokens in both the feedforward and attention layers. It achieves stronger performance than LongT5 with much faster training and inference.
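
To make the idea of conditional computation in a feedforward layer concrete, here is a minimal sketch in plain Python. It is not the CoLT5 implementation. It assumes a cheap "light" branch applied to every token and an expensive "heavy" branch applied only to the k tokens with the highest router score. The names `conditional_feedforward`, `router`, `light`, and `heavy`, the dot-product router, the raw-score scaling, and all sizes are hypothetical choices made for illustration.

```python
import random


def matvec(weights, vec):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def feedforward(x, w_in, w_out):
    """Two-layer MLP with a ReLU in between: w_out @ relu(w_in @ x)."""
    hidden = [max(0.0, h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def conditional_feedforward(tokens, router, light, heavy, k):
    """Apply the light branch to all tokens and the heavy branch to the top-k.

    Assumption: a token's importance is its dot product with `router`, and the
    heavy output is scaled by that raw score. Both are simplifications.
    """
    scores = [sum(r * x for r, x in zip(router, tok)) for tok in tokens]
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    routed = set(ranked[:k])

    outputs = []
    for i, tok in enumerate(tokens):
        # Residual connection plus the cheap branch, computed for every token.
        out = [x + y for x, y in zip(tok, feedforward(tok, *light))]
        if i in routed:
            # The expensive branch runs only for the selected tokens.
            heavy_out = feedforward(tok, *heavy)
            out = [o + scores[i] * h for o, h in zip(out, heavy_out)]
        outputs.append(out)
    return outputs, sorted(routed)


rng = random.Random(0)
d_model, d_light, d_heavy = 4, 2, 16  # illustrative sizes only
tokens = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(8)]
router = [rng.uniform(-1, 1) for _ in range(d_model)]
light = (random_matrix(d_light, d_model, rng), random_matrix(d_model, d_light, rng))
heavy = (random_matrix(d_heavy, d_model, rng), random_matrix(d_model, d_heavy, rng))

outputs, routed = conditional_feedforward(tokens, router, light, heavy, k=2)
print("tokens sent through the heavy branch:", routed)
```

In this sketch the heavy branch runs on 2 of the 8 tokens, so its cost grows with k rather than with the input length. That is the kind of saving the summary above attributes to conditional computation.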

Use CoLT5 to process long documents, particularly in natural language processing tasks, where it offers stronger performance than LongT5 along with faster training and inference. CoLT5 can also make effective use of extremely long inputs, showing strong gains up to an input length of 64k.

Thu Mar 16 2023
Tue Mar 14 2023
Mon Mar 13 2023
Sun Mar 12 2023