Does compressing activations help model parallel training?
Machine learning
Model parallelism
Improve training speed of large-scale Transformer models
Presents the first empirical study on the effectiveness of compression methods for speeding up communication in model-parallel training.
Provides insights into the effectiveness of compression methods for model parallelism, which can potentially improve the training speed of large-scale Transformer models. The study evaluates three common classes of compression algorithms and analyzes how their behavior changes as the model is scaled up. Future development of compression algorithms for model parallelism can build on these insights.
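As a rough illustration of one common class of activation compression (quantization), the sketch below shows per-tensor int8 quantization and dequantization of activations at a model-parallel partition boundary, where the smaller payload would reduce communication volume. The function names and shapes are hypothetical illustrations, not the paper's code.

```python
# Minimal sketch, assuming simple per-tensor symmetric int8 quantization
# of activations before they are sent between model-parallel workers.
import torch

def quantize_activations(x: torch.Tensor):
    """Compress float32 activations to int8 plus a single scale factor."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize_activations(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover approximate float32 activations on the receiving worker."""
    return q.to(torch.float32) * scale

if __name__ == "__main__":
    acts = torch.randn(4, 1024)            # activations at a partition boundary
    q, s = quantize_activations(acts)      # ~4x smaller payload than float32
    recovered = dequantize_activations(q, s)
    print("max abs error:", (acts - recovered).abs().max().item())
```

The trade-off this illustrates is the one the study measures: less data crossing the model-parallel boundary versus the approximation error introduced into the activations.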