Understanding Derivatives in Calculus
Derivative – In simple terms, it measures the rate of change: how fast one quantity changes as another quantity changes.
Key Symbols:
- d – Indicates “a little bit of.” For example, dx means a small change in x, or du represents a slight variation in u.
So, what exactly does dy/dx convey?
dy/dx tells you how much y changes with respect to x. It captures both the magnitude and the direction of that change.
A fancier word for the derivative is the gradient. The derivative at any point gives the slope of the function at that point.
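To make "slope" concrete, here is a small sketch that approximates dy/dx numerically as the slope between two nearby points. The function y = x², the step size dx, and the helper name numerical_derivative are all chosen just for illustration; the exact derivative of x² is 2x, so the two printed values should agree closely.

```python
def f(x):
    # Example function: y = x^2, whose exact derivative is dy/dx = 2x
    return x ** 2

def numerical_derivative(f, x, dx=1e-6):
    # Slope between two nearby points: (change in y) / (change in x)
    return (f(x + dx) - f(x)) / dx

x = 3.0
print("Numerical dy/dx at x = 3:", numerical_derivative(f, x))  # ~6.0
print("Exact     dy/dx at x = 3:", 2 * x)                       # 6.0
```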
When training a neural network, this idea appears as the weight update rule used in back-propagation:

w_new = w_old - learning_rate × ∂Loss/∂w

The weights are updated in the direction that reduces the error (see the numeric sketch after the explanation below).
Explanation:
∂Loss/∂w = +ve : Increasing the weight would increase the loss, so to decrease the loss we subtract from the weight (the update rule subtracts a positive quantity).
∂Loss/∂w = -ve : Increasing the weight would decrease the loss, so to decrease the loss we add to the weight (the update rule subtracts a negative quantity, which increases the weight).
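Here is a minimal numeric sketch of the same rule covering both cases; the weight value, gradients, and learning rate are made up purely for illustration.

```python
learning_rate = 0.1

# Case 1: positive gradient -> the update subtracts, so the weight decreases
w = 0.5
grad = 2.0   # ∂Loss/∂w = +ve
w = w - learning_rate * grad
print(w)     # 0.3 (weight moved down)

# Case 2: negative gradient -> subtracting a negative adds, so the weight increases
w = 0.5
grad = -2.0  # ∂Loss/∂w = -ve
w = w - learning_rate * grad
print(w)     # 0.7 (weight moved up)
```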
In the process of back-propagation, we follow certain steps as outlined below:
```python
# Initialize epochs
epochs = 5

# Loop over each epoch
for i in range(epochs):
    epoch_losses = []

    # Loop over each training sample
    for j in range(X.shape[0]):
        # Step 1: Select 1 row (randomly)
        x = select_random_row(X)
        y = select_corresponding_label(Y)

        # Step 2: Predict (using forward propagation)
        y_pred = predict(x, weights, bias)

        # Step 3: Calculate loss (using the loss function)
        loss = calculate_loss(y_pred, y)
        epoch_losses.append(loss)

        # Step 4: Update weights & bias using gradient descent
        gradient_w = compute_gradient_w(x, y, y_pred)
        gradient_b = compute_gradient_b(y, y_pred)
        weights = weights - learning_rate * gradient_w
        bias = bias - learning_rate * gradient_b

    # Calculate average loss for the epoch
    avg_loss = calculate_avg_loss(epoch_losses)
```
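The helper functions in the outline above (select_random_row, predict, calculate_loss, and so on) are left abstract. Below is a minimal, self-contained sketch of the same loop for a single sigmoid neuron trained with binary cross-entropy loss; the toy data, the choice of model, and the loss function are assumptions made only so the steps are runnable end to end.

```python
import numpy as np

np.random.seed(0)

# Toy data: 4 samples, 2 features, binary labels (made up for illustration)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([0, 0, 0, 1])

weights = np.zeros(X.shape[1])
bias = 0.0
learning_rate = 0.1
epochs = 5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for i in range(epochs):
    epoch_losses = []
    for j in range(X.shape[0]):
        # Step 1: pick one row at random (stochastic gradient descent)
        idx = np.random.randint(X.shape[0])
        x, y = X[idx], Y[idx]

        # Step 2: forward pass through the single neuron
        y_pred = sigmoid(np.dot(weights, x) + bias)

        # Step 3: binary cross-entropy loss for this sample
        eps = 1e-12
        loss = -(y * np.log(y_pred + eps) + (1 - y) * np.log(1 - y_pred + eps))
        epoch_losses.append(loss)

        # Step 4: gradients of the loss w.r.t. weights and bias, then update
        gradient_w = (y_pred - y) * x
        gradient_b = (y_pred - y)
        weights = weights - learning_rate * gradient_w
        bias = bias - learning_rate * gradient_b

    avg_loss = np.mean(epoch_losses)
    print(f"epoch {i + 1}: average loss = {avg_loss:.4f}")
```

Because the gradients here come from the chain rule applied to sigmoid plus cross-entropy, the per-sample gradient simplifies to (y_pred - y) times the input, which is exactly what the update step uses.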
For a more in-depth understanding, explore Back-Propagation: The Science of it.