Backpropagation is an algorithm for computing the gradient of the loss function with respect to the neural network's parameters. It calculates the gradient layer by layer using the chain rule. Knowing the gradients with respect to the parameters allows us to update them with an optimization algorithm such as stochastic gradient descent (SGD).
Backpropagation starts from the last layer and propagates the gradient back to the first. It is concerned only with calculating the gradient, not with the whole learning process. It also requires that the derivatives of the activation functions be known before it runs.
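The process above can be sketched for a tiny one-hidden-layer network. This is a minimal illustration, not a reference implementation: the network shape, the sigmoid activation, the squared-error loss, and all variable names (W1, b1, and so on) are assumptions chosen to keep the example short.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Derivative of the activation, known before backpropagation runs.
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))            # a single input vector
y = np.array([[1.0]])                  # its target

W1 = rng.normal(size=(4, 3)); b1 = np.zeros((4, 1))
W2 = rng.normal(size=(1, 4)); b2 = np.zeros((1, 1))

# Forward pass: cache the pre-activations for the backward pass.
z1 = W1 @ x + b1; a1 = sigmoid(z1)
z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
loss = 0.5 * np.sum((a2 - y) ** 2)

# Backward pass: start at the last layer and apply the chain rule
# to propagate the gradient toward the first layer.
delta2 = (a2 - y) * sigmoid_prime(z2)          # dL/dz2
dW2 = delta2 @ a1.T
db2 = delta2
delta1 = (W2.T @ delta2) * sigmoid_prime(z1)   # dL/dz1
dW1 = delta1 @ x.T
db1 = delta1

# Backpropagation stops at the gradients; a separate optimizer
# (here, one plain SGD step) uses them to update the parameters.
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```

Note how the backward pass reuses the quantities cached during the forward pass, which is why the two passes are usually implemented together.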