Activation Functions in Neural Networks
Introduction
Activation functions play a crucial role in neural networks. They introduce non-linearity, enabling the network to learn complex patterns. Without activation functions, a stack of layers would collapse into a single linear transformation, no matter how deep the network is, severely limiting its ability to model real-world data.
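To make this concrete, the following sketch (using NumPy, with arbitrary example weights chosen purely for illustration) shows that two linear layers applied back to back, with no activation in between, compute exactly the same function as a single linear layer:

import numpy as np

# Two linear layers with no activation in between (arbitrary example weights).
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
W2 = np.array([[0.5, -1.0], [2.0, 0.0]])
x = np.array([1.0, -2.0])

two_layers = W2 @ (W1 @ x)   # apply the layers one after another
one_layer = (W2 @ W1) @ x    # a single, combined linear layer

print(np.allclose(two_layers, one_layer))  # True: the composition is still linear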
Types of Activation Functions
There are several types of activation functions, each with its own characteristics and typical applications. Below, we explore some of the most commonly used ones.
1. Sigmoid Activation Function
The Sigmoid function is defined as:
σ(x) = 1 / (1 + e^(-x))
It squashes any real-valued input into the range (0, 1).
Example:
import numpy as np

def sigmoid(x):
    # Map each element into (0, 1) via the logistic function.
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([-1, 0, 1, 2])))  # ≈ [0.269, 0.5, 0.731, 0.881]
2. Tanh Activation Function
The Tanh function is defined as:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
It squashes any real-valued input into the range (-1, 1).
Example:
import numpy as np

def tanh(x):
    # NumPy's built-in hyperbolic tangent; output lies in (-1, 1).
    return np.tanh(x)

print(tanh(np.array([-1, 0, 1, 2])))  # ≈ [-0.762, 0.0, 0.762, 0.964]
3. ReLU Activation Function
The Rectified Linear Unit (ReLU) function is defined as:
ReLU(x) = max(0, x)
It outputs the input if it is positive; otherwise, it outputs zero.
Example:
import numpy as np

def relu(x):
    # Zero out negative values; keep non-negative values unchanged.
    return np.maximum(0, x)

print(relu(np.array([-1, 0, 1, 2])))  # [0 0 1 2]
4. Leaky ReLU Activation Function
The Leaky ReLU function is a variant of ReLU, defined as:
Leaky ReLU(x) = max(αx, x)
It outputs the input if it is positive; otherwise, it outputs a small fraction (α) of the input.
Example:
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Keep positive values; scale negative values by the small slope alpha.
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-1, 0, 1, 2])))  # ≈ [-0.01, 0.0, 1.0, 2.0]
5. Softmax Activation Function
The Softmax function is commonly used in the output layer of classification models. It is defined as:
softmax(x_i) = e^(x_i) / Σ_j e^(x_j), for j = 1 to n
It converts the input values into probabilities that sum to 1.
Example:
import numpy as np

def softmax(x):
    # Subtract the max before exponentiating for numerical stability;
    # this shift does not change the result.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

print(softmax(np.array([1, 2, 3])))  # ≈ [0.090, 0.245, 0.665]
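In practice, the output layer of a classifier usually produces a whole batch of logit vectors at once. The sketch below (the axis argument, the helper name softmax_batch, and the example logits are illustrative assumptions, not part of the original) applies the same idea row by row:

import numpy as np

def softmax_batch(logits, axis=-1):
    # Apply softmax independently along the given axis (one row per example).
    shifted = logits - np.max(logits, axis=axis, keepdims=True)  # stability shift
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical batch of two logit vectors.
logits = np.array([[1.0, 2.0, 3.0],
                   [2.0, 0.0, 1.0]])
print(softmax_batch(logits))  # each row sums to 1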
Conclusion
Activation functions are essential for introducing non-linearity into neural networks, enabling them to learn complex mappings from inputs to outputs. Each function has its strengths and weaknesses, and the choice can significantly affect a network's performance. Understanding these properties helps in designing more effective neural network models.
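To illustrate how these functions fit into a network, here is a minimal sketch of a single forward pass through a tiny classifier with a ReLU hidden layer and a softmax output layer. The layer sizes, random weights, and example input are assumptions chosen purely for demonstration:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny network: 4 inputs -> 5 hidden units (ReLU) -> 3 classes (softmax).
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def forward(x):
    h = relu(x @ W1 + b1)        # hidden layer: linear transform + ReLU
    return softmax(h @ W2 + b2)  # output layer: linear transform + softmax

x = np.array([0.5, -1.0, 2.0, 0.0])
probs = forward(x)
print(probs, probs.sum())  # three class probabilities summing to 1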