Wed Jun 22 2022
Tue Jun 21 2022

Insights into Pre-training via Simpler Synthetic Tasks

Pre-training
Machine learning
Natural language processing
Language processing
Text generation
Speech recognition

Pre-training T5 on a synthetic dataset substantially outperforms training from scratch and achieves up to 67% of the benefits of natural pre-training.

Synthetic pre-training tasks can yield significant gains on downstream tasks, and they can be simplified while still retaining much of those gains.
