
PyTorch implementation of popular attention mechanisms, Vision Transformers, MLP-like models and CNNs.

pip install pytorch-attention==1.0.0



This codebase is a PyTorch implementation of various attention mechanisms, CNNs, Vision Transformers and MLP-Like models.


Attention mechanisms

  • Squeeze-and-Excitation Networks (CVPR 2018)
  • CBAM: Convolutional Block Attention Module (ECCV 2018)
  • BAM: Bottleneck Attention Module (BMVC 2018)
  • A2-Nets: Double Attention Networks (NeurIPS 2018)
  • SRM: A Style-Based Recalibration Module for Convolutional Neural Networks (ICCV 2019)
  • GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond (ICCVW 2019)
  • Linear Context Transform Block (AAAI 2020)
  • ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks (CVPR 2020)
  • Rotate to Attend: Convolutional Triplet Attention Module (WACV 2021)
  • Gaussian Context Transformer (CVPR 2021)
  • Coordinate Attention for Efficient Mobile Network Design (CVPR 2021)
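
To give a feel for what the modules in this list do, here is a minimal sketch of a Squeeze-and-Excitation style channel attention block. It is an illustration only: the class name `SEBlock` and the `reduction` parameter are assumptions made for this example and are not taken from this package's API.

```python
import torch
from torch import nn


class SEBlock(nn.Module):
    """Illustrative Squeeze-and-Excitation block (hypothetical, not this package's API)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Squeeze: global average pooling reduces each channel to one value.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: a small bottleneck MLP produces per-channel gates in (0, 1).
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        # Rescale every channel of the input by its gate.
        return x * gates


x = torch.randn(2, 64, 32, 32)
out = SEBlock(64)(x)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

The block keeps the input shape, so it can be dropped after a convolutional stage without changing the rest of a network.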

Vision Transformers

  • An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale (ICLR 2021)
  • XCiT: Cross-Covariance Image Transformer (NeurIPS 2021)
  • Rethinking Spatial Dimensions of Vision Transformers (ICCV 2021)
  • CvT: Introducing Convolutions to Vision Transformers (ICCV 2021)
  • CMT: Convolutional Neural Networks Meet Vision Transformers (CVPR 2022)
  • DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition (TMM 2023)
  • BViT: Broad Attention based Vision Transformer (TNNLS 2023)
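
The models above share a common first step: splitting an image into patches and projecting each patch to a token. A minimal sketch of that step, in the spirit of the "16x16 words" paper, might look like the following. The class name `PatchEmbedding` and its default arguments are assumptions for illustration, not this package's API.

```python
import torch
from torch import nn


class PatchEmbedding(nn.Module):
    """Illustrative ViT-style patch embedding (hypothetical, not this package's API)."""

    def __init__(self, in_channels: int = 3, patch_size: int = 16, embed_dim: int = 768):
        super().__init__()
        # A strided convolution is equivalent to cutting non-overlapping
        # patches and applying one shared linear projection to each.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.proj(x)                      # (B, D, H/P, W/P)
        return x.flatten(2).transpose(1, 2)   # (B, N, D) with N = (H/P) * (W/P)


images = torch.randn(2, 3, 224, 224)
tokens = PatchEmbedding()(images)
print(tokens.shape)  # torch.Size([2, 196, 768])
```

A 224x224 image with 16x16 patches yields 14 x 14 = 196 tokens, which a Transformer encoder then processes as a sequence.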

Convolutional Neural Networks (CNNs)

  • Deep Residual Learning for Image Recognition (CVPR 2016)
  • Densely Connected Convolutional Networks (CVPR 2017)
  • Deep Pyramidal Residual Networks (CVPR 2017)
  • MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (arXiv 2017)
  • MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)
  • Searching for MobileNetV3 (ICCV 2019)
  • Res2Net: A New Multi-scale Backbone Architecture (TPAMI 2019)
  • GhostNet: More Features from Cheap Operations (CVPR 2020)
  • A ConvNet for the 2020s (CVPR 2022)
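
As a point of reference for the CNNs listed here, the following is a minimal sketch of a ResNet-style basic residual block with an identity shortcut. The class name `BasicBlock` and its constructor are assumptions for this example; the package's own implementations may be organized differently.

```python
import torch
from torch import nn


class BasicBlock(nn.Module):
    """Illustrative residual block with an identity shortcut (hypothetical)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The shortcut adds the input back before the final activation.
        return self.relu(out + x)


feature_map = torch.randn(2, 64, 56, 56)
block_out = BasicBlock(64)(feature_map)
print(block_out.shape)  # torch.Size([2, 64, 56, 56])
```

This variant keeps the channel count and spatial size fixed; blocks that downsample or widen need a projection on the shortcut path as well.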

MLP-Like Models

  • MLP-Mixer: An all-MLP Architecture for Vision (NeurIPS 2021)
  • Pay Attention to MLPs (NeurIPS 2021)
  • Global Filter Networks for Image Classification (NeurIPS 2021)
  • Sparse MLP for Image Recognition: Is Self-Attention Really Necessary? (AAAI 2022)
  • DynaMixer: A Vision MLP Architecture with Dynamic Mixing (ICML 2022)
  • Patches Are All You Need? (TMLR 2022)
  • Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (TPAMI 2022)
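
To illustrate the idea behind MLP-like models, here is a minimal sketch of an MLP-Mixer style block that alternates token mixing and channel mixing. The class name `MixerBlock` and its argument names are assumptions for this example and do not reflect this package's API.

```python
import torch
from torch import nn


class MixerBlock(nn.Module):
    """Illustrative MLP-Mixer style block (hypothetical, not this package's API)."""

    def __init__(self, num_tokens: int, dim: int, token_hidden: int, channel_hidden: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Token-mixing MLP acts across the token (patch) dimension.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, token_hidden), nn.GELU(),
            nn.Linear(token_hidden, num_tokens),
        )
        self.norm2 = nn.LayerNorm(dim)
        # Channel-mixing MLP acts on each token's feature vector.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, D)
        y = self.norm1(x).transpose(1, 2)                # (B, D, N)
        x = x + self.token_mlp(y).transpose(1, 2)        # back to (B, N, D)
        return x + self.channel_mlp(self.norm2(x))


patch_tokens = torch.randn(2, 196, 512)
mixed = MixerBlock(num_tokens=196, dim=512, token_hidden=256, channel_hidden=2048)(patch_tokens)
print(mixed.shape)  # torch.Size([2, 196, 512])
```

Because the token-mixing MLP has a fixed input width, a block like this is tied to one number of tokens, which is one reason several later papers in this list explore alternative mixing schemes.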