Sun Aug 14 2022 - Top Trending AI Papers

BEIT V2: Masked Image Modeling with Vector-Quantized Visual Tokenizers

Image Processing

Computer Vision

Machine Learning

image classification

semantic segmentation

self-supervised representation learning

The paper proposes using a semantically rich visual tokenizer as the reconstruction target for masked prediction, which gives a systematic way to move masked image modeling (MIM) from pixel-level to semantic-level targets.

This could offer a more effective way to use high-level semantics in self-supervised representation learning, leading to better image classification and semantic segmentation results.
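
To make the idea of predicting discrete visual tokens at masked positions concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names (`quantize`, `masked_token_loss`), the tiny hand-written codebook, the Euclidean nearest-neighbor lookup, and the uniform logits are all assumptions chosen for illustration. A real setup would learn the tokenizer and produce the logits with a vision Transformer.

```python
import math

def quantize(patch_features, codebook):
    """Map each patch feature to the index of its nearest codebook vector.

    Assumption: squared Euclidean distance is used for the lookup.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(feat, codebook[k]))
        for feat in patch_features
    ]

def masked_token_loss(logits, target_ids, mask):
    """Mean cross-entropy between predicted logits and token ids,
    computed over masked positions only."""
    total, count = 0.0, 0
    for row, target, is_masked in zip(logits, target_ids, mask):
        if not is_masked:
            continue  # visible patches do not contribute to the loss
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[target]
        count += 1
    return total / count if count else 0.0

# Hypothetical toy data: 3 codebook entries, 4 image patches with 2-d features.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
patches = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8], [1.1, 0.9]]

target_ids = quantize(patches, codebook)   # discrete reconstruction targets
mask = [True, False, True, False]          # patches hidden from the encoder
logits = [[0.0, 0.0, 0.0] for _ in patches]  # stand-in for model predictions

loss = masked_token_loss(logits, target_ids, mask)
print(target_ids, loss)
```

With uniform logits the loss equals the log of the codebook size, which is the baseline a trained model would need to beat. The point of the sketch is only that the target for each masked patch is a token id from a codebook rather than raw pixel values.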

https://arxiv.org/pdf/2208.06366.pdf

https://arxiv.org/abs/2208.06366

https://twitter.com/arankomatsuzaki/status/1558976412916207616/photo/1