
7.3 Popular CNN architectures: AlexNet, VGG, ResNet, and Inception

2 min read · July 25, 2024

Convolutional Neural Networks (CNNs) have revolutionized computer vision. Key architectures like AlexNet, VGG, ResNet, and Inception have pushed the boundaries of image recognition, introducing innovations that shaped the field.

These architectures brought game-changing ideas: ReLU activations, dropout regularization, deep networks with small filters, residual connections, and parallel convolution paths. Each design tackled specific challenges, paving the way for more powerful and efficient image processing systems.

Key innovations of AlexNet

  • ReLU activation revolutionized neural networks with the non-linear function f(x) = max(0, x), speeding up training and reducing the vanishing gradient problem (see the sketch after this list)
  • Dropout regularization randomly deactivates neurons during training, forcing the network to learn robust features and improving generalization
  • GPU training on NVIDIA GTX 580 GPUs enabled training larger models on the ImageNet dataset
  • Local response normalization applied after ReLU in certain layers aided generalization and reduced overfitting
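
To make the first two ideas concrete, here is a minimal PyTorch sketch (the framework choice and exact layer sizes are my assumptions, not part of this guide) combining ReLU activation and dropout in an AlexNet-style first stage; it is illustrative, not the full network:

```python
import torch
import torch.nn as nn

class AlexNetStyleBlock(nn.Module):
    """Sketch of AlexNet's opening stage plus a dropout-regularized classifier."""
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            # Large 11x11 first filter, as in AlexNet's opening layer
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),           # f(x) = max(0, x)
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),               # randomly deactivates neurons during training
            nn.Linear(64 * 27 * 27, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

x = torch.randn(1, 3, 224, 224)          # one ImageNet-sized input
print(AlexNetStyleBlock()(x).shape)      # torch.Size([1, 1000])
```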

Design principles of VGG

  • Small 3x3 convolutional filters used throughout the network increased non-linearity and reduced parameters (see the sketch after this list)
  • Deep architecture with 16-layer (VGG-16) and 19-layer (VGG-19) variants demonstrated the power of network depth in improving performance
  • Consistent pattern of repeating convolutional blocks followed by max pooling simplified network design and analysis
  • Fully connected layers at the end, with 4096 channels in the first two and 1000 in the last for ImageNet classification
  • Training deeper variants was eased by training shallower versions first and using their weights to initialize the deeper ones
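
A short PyTorch sketch of one VGG-style stage, assuming the standard pattern of stacked 3x3 convolutions followed by 2x2 max pooling; the `vgg_stage` helper name and its channel counts are illustrative:

```python
import torch
import torch.nn as nn

def vgg_stage(in_ch: int, out_ch: int, num_convs: int) -> nn.Sequential:
    """One VGG-style stage: repeated 3x3 convs, then max pooling."""
    layers = []
    for i in range(num_convs):
        layers += [
            # Small 3x3 filters throughout, padding 1 to preserve spatial size
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        ]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))  # halve spatial size
    return nn.Sequential(*layers)

stage = vgg_stage(3, 64, num_convs=2)
print(stage(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 64, 112, 112])
```

Two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 filter while using fewer weights (2 × 9 vs. 25 per channel pair) and adding an extra non-linearity between them.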

Residual connections in ResNet

  • Skip connections allow information to bypass layers via the formula y = F(x) + x, where F(x) is the residual mapping learned by the stacked layers (see the sketch after this list)
  • Addressed the degradation problem in very deep networks, enabling training of architectures with 100+ layers
  • Easier optimization, as residual connections help the network learn identity mappings when extra layers are not needed
  • Improved gradient flow mitigated the vanishing gradient problem in deep networks
  • 1x1 convolutions in bottleneck blocks used to reduce and then restore dimensions, cutting computational complexity
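
A minimal PyTorch sketch of a residual block computing y = F(x) + x; it assumes the input and output channel counts match, so the identity shortcut needs no 1x1 projection (real ResNets add one when dimensions change):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: y = F(x) + x, with F(x) two 3x3 conv layers."""
    def __init__(self, channels: int):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Skip connection: the input bypasses F and is added back in
        return self.relu(self.f(x) + x)

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```

Because the shortcut passes gradients straight through the addition, each block only has to learn the residual F(x) = y − x, which is close to zero when the identity is already a good answer.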

Inception architecture and parallel paths

  • Multiple convolution operations performed in parallel, with outputs concatenated (see the sketch after this list)
  • Various filter sizes (1x1, 3x3, 5x5) captured features at different scales simultaneously
  • 1x1 convolutions reduced dimensionality and computational cost before the larger convolutions
  • Pooling path included a parallel max pooling operation that retained important features
  • Inception modules served as building blocks, stacked to form the complete architecture
  • Global average pooling replaced fully connected layers at the end, reducing parameters and preventing overfitting
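
A PyTorch sketch of an Inception-style module with the four parallel paths described above; the channel counts are illustrative choices, not GoogLeNet's exact values:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Four parallel paths whose outputs are concatenated along channels."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.path1 = nn.Conv2d(in_ch, 64, kernel_size=1)        # 1x1 only
        self.path2 = nn.Sequential(
            nn.Conv2d(in_ch, 96, kernel_size=1),                 # 1x1 reduces channels first
            nn.Conv2d(96, 128, kernel_size=3, padding=1),        # then 3x3
        )
        self.path3 = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=1),
            nn.Conv2d(16, 32, kernel_size=5, padding=2),         # then 5x5
        )
        self.path4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),    # pooling path
            nn.Conv2d(in_ch, 32, kernel_size=1),
        )

    def forward(self, x):
        # Concatenate the parallel paths: 64 + 128 + 32 + 32 = 256 channels
        return torch.cat([self.path1(x), self.path2(x),
                          self.path3(x), self.path4(x)], dim=1)

m = InceptionModule(192)
print(m(torch.randn(1, 192, 28, 28)).shape)  # torch.Size([1, 256, 28, 28])
```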
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

