
11.1 Fundamentals of Neural Networks and Backpropagation

3 min read · August 7, 2024

Neural networks are the backbone of deep learning, mimicking the human brain's structure. They consist of interconnected neurons that process and transmit information, enabling machines to learn complex patterns and make predictions from data.

This section covers the fundamentals of neural networks, including their structure and training process. We'll explore key concepts like activation functions, backpropagation, and optimization techniques that power these models.

Artificial Neural Networks

Structure of Artificial Neural Networks

  • Artificial Neuron is the fundamental building block of artificial neural networks
    • Receives inputs from other neurons or external sources
    • Applies weights to the inputs to determine their importance
    • Sums the weighted inputs and applies an activation function to produce an output (see the sketch after this list)
  • Activation Function determines the output of a neuron based on its input
    • Common activation functions include sigmoid, tanh, and ReLU (Rectified Linear Unit)
    • Introduces non-linearity into the neural network, enabling it to learn complex patterns
  • Feedforward Neural Network consists of layers of neurons connected in a forward direction
    • Input Layer receives the initial data
    • Output Layer produces the final predictions or classifications
    • Information flows from the input layer through the hidden layers to the output layer
  • Hidden Layers are the layers between the input and output layers
    • Enable the neural network to learn more complex representations of the input data
    • Increasing the number of hidden layers creates a deeper neural network (deep learning)
  • Weights and Biases are learnable parameters of the neural network
    • Weights determine the strength of connections between neurons
    • Biases provide an additional degree of freedom to shift the activation function
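
To make the structure above concrete, here is a minimal NumPy sketch of a single forward pass through one hidden layer. The layer sizes, random weights, and example input are all made up for illustration, not taken from the text.

```python
import numpy as np

def relu(x):
    # ReLU activation: max(0, x), applied element-wise
    return np.maximum(0.0, x)

def sigmoid(x):
    # Sigmoid activation: squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative sizes: 3 input features, 4 hidden neurons, 1 output neuron
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden-layer weights and biases
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # output-layer weights and biases

x = np.array([0.5, -1.2, 3.0])  # one example input

h = relu(W1 @ x + b1)       # hidden layer: weighted sum plus non-linearity
y = sigmoid(W2 @ h + b2)    # output layer: produces the prediction
print(y)
```

Note how each layer is just a weighted sum (weights times inputs, plus a bias) followed by an activation function; stacking more such layers gives a deeper network.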

Training Neural Networks

Optimization Techniques for Neural Networks

  • Backpropagation is the algorithm used to train neural networks
    • Calculates the gradient of the loss function with respect to the weights and biases
    • Propagates the error backward through the network to update the parameters
    • Enables the network to learn from its mistakes and improve its performance
  • Gradient Descent is an optimization algorithm used to minimize the loss function
    • Iteratively adjusts the weights and biases in the direction of steepest descent of the loss function
    • Stochastic Gradient Descent (SGD) performs updates based on small batches of training data
    • Variants like Adam, RMSprop, and AdaGrad adapt the learning rate for each parameter
  • Loss Function measures the discrepancy between the predicted and actual outputs
    • Common loss functions include Mean Squared Error (MSE) for regression and cross-entropy for classification
    • Provides a quantitative measure of how well the neural network is performing
  • Learning Rate determines the step size at which the weights and biases are updated during gradient descent
    • Higher learning rates lead to faster convergence but may overshoot the optimal solution
    • Lower learning rates result in slower but more stable convergence (a minimal training loop tying these pieces together follows this list)
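
The following sketch shows how backpropagation, gradient descent, MSE loss, and the learning rate fit together in one training loop. It assumes a toy regression problem (learning y = 2x), a single tanh hidden layer, and hand-derived gradients; the layer sizes, learning rate, and step count are illustrative choices, not prescriptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: learn y = 2x (shapes and values are illustrative)
X = rng.normal(size=(64, 1))
y = 2.0 * X

# One tanh hidden layer; weights and biases are the learnable parameters
W1, b1 = rng.normal(size=(1, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)) * 0.1, np.zeros(1)
lr = 0.1  # learning rate: step size of each gradient-descent update

for step in range(500):
    # Forward pass
    h = np.tanh(X @ W1 + b1)         # hidden activations, shape (64, 8)
    pred = h @ W2 + b2               # predictions, shape (64, 1)
    loss = np.mean((pred - y) ** 2)  # Mean Squared Error loss

    # Backward pass: propagate the error from the loss to each parameter
    d_pred = 2 * (pred - y) / len(X)    # dLoss/dpred
    dW2 = h.T @ d_pred                  # gradient for output weights
    db2 = d_pred.sum(axis=0)
    d_h = d_pred @ W2.T * (1 - h ** 2)  # chain rule through tanh
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient-descent update: move each parameter against its gradient
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss: {loss:.6f}")
```

Every update follows the same rule the list describes: parameter ← parameter − learning rate × gradient. Raising lr speeds things up until the loss starts oscillating or diverging; lowering it trades speed for stability.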

Techniques for Improving Model Performance

  • Overfitting occurs when a model performs well on the training data but poorly on unseen data
    • Happens when the model learns noise or specific patterns in the training data that do not generalize well
    • Regularization techniques can help mitigate overfitting by adding constraints to the model
  • Underfitting occurs when a model is too simple to capture the underlying patterns in the data
    • Results in poor performance on both the training and test data
    • Increasing the model complexity (e.g., adding more layers or neurons) can help address underfitting
  • Regularization techniques add constraints to the model to prevent overfitting
    • L1 regularization (Lasso) adds the absolute values of the weights to the loss function
    • L2 regularization (Ridge) adds the squared values of the weights to the loss function
    • Encourages the model to learn simpler and more generalizable patterns
  • Dropout is a regularization technique that randomly drops out (sets to zero) a fraction of neurons during training
    • Prevents neurons from relying too heavily on specific inputs and encourages them to learn robust features
    • Helps reduce overfitting by introducing noise and preventing complex co-adaptations of neurons (a short sketch of these techniques follows this list)
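
As a rough illustration, the sketch below implements the L1 and L2 penalty terms and "inverted" dropout as standalone NumPy functions. The penalty strength lam and drop probability p are arbitrary example values, and the activations are randomly generated stand-ins for a real hidden layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def l1_penalty(weights, lam=1e-3):
    # L1 (Lasso): lam times the sum of absolute weight values,
    # added to the loss; pushes many weights toward exactly zero
    return lam * sum(np.sum(np.abs(W)) for W in weights)

def l2_penalty(weights, lam=1e-3):
    # L2 (Ridge): lam times the sum of squared weight values,
    # added to the loss; shrinks all weights toward zero
    return lam * sum(np.sum(W ** 2) for W in weights)

def dropout(activations, p=0.5, training=True):
    # Dropout: during training, zero each activation with probability p
    # and rescale survivors by 1/(1-p) so expected values stay the same;
    # at inference time the activations pass through unchanged
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

weights = [rng.normal(size=(3, 8)), rng.normal(size=(8, 1))]
print(l1_penalty(weights), l2_penalty(weights))

h = rng.normal(size=(4, 8))   # stand-in hidden-layer activations
print(dropout(h, p=0.5))      # roughly half the entries are zeroed
```

In practice the penalty term is simply added to the loss before computing gradients, and dropout is applied to hidden activations during the forward pass of training only.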