Research An Image is Worth 16x16 Words: Transformers for Image Recognition (Paper Explained) Large Transformers trained on large datasets outperform CNN-based architectures and achieve state-of-the-art results on image recognition tasks
Research Pre-training via Paraphrasing (Paper Explained) Transformer model pre-trained on document retrieval and reconstruction performs well on downstream tasks in both fine-tuned and zero-shot settings