Pruning (artificial neural network)

In deep learning, pruning is the practice of removing parameters from an existing artificial neural network. The goal of this process is to reduce the size of the neural network whilst maintaining accuracy. This can be compared to the biological process of synaptic pruning which takes place in mammalian brains during development.

Node (neuron) pruning

A basic algorithm for pruning is as follows:

1. Evaluate the importance of each neuron.
2. Rank the neurons according to their importance.
3. Remove the least important neuron.
4. Check a termination condition to see whether to continue pruning.
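
The loop above can be sketched for a single hidden layer as follows. This is a minimal illustration rather than a reference implementation: the importance score (the L2 norm of a neuron's incoming weights), the termination condition (a target layer width), and the names neuron_importance and prune_neurons are all assumptions made for the example.

```python
import numpy as np

def neuron_importance(W_in):
    """Score each hidden neuron by the L2 norm of its incoming weights.

    This metric is an assumption for the sketch; other scores can be swapped in.
    """
    return np.linalg.norm(W_in, axis=0)

def prune_neurons(W_in, W_out, target_width):
    """Remove the least important hidden neuron until target_width neurons remain.

    W_in  has shape (n_inputs, n_hidden): column j holds neuron j's incoming weights.
    W_out has shape (n_hidden, n_outputs): row j holds neuron j's outgoing weights.
    """
    W_in, W_out = W_in.copy(), W_out.copy()
    # Step 4: the (assumed) termination condition is reaching the target width.
    while W_in.shape[1] > target_width:
        scores = neuron_importance(W_in)        # Step 1: evaluate importance
        ranking = np.argsort(scores)            # Step 2: rank, least important first
        least = ranking[0]
        W_in = np.delete(W_in, least, axis=1)   # Step 3: remove the neuron's inputs
        W_out = np.delete(W_out, least, axis=0) #         and its outputs
    return W_in, W_out
```

Removing a neuron deletes both its column of incoming weights and its row of outgoing weights, so the layer genuinely becomes narrower instead of merely containing zeros.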

Edge (weight) pruning

Most work on neural network pruning does not remove entire neurons or layers. Instead, it removes the least significant weights, that is, sets their values to zero. This can be done either globally, by comparing weights across all layers of the network, or locally, by comparing weights within each layer separately.
Different metrics can be used to measure the importance of each weight. Weight magnitude as well as combinations of weight and gradient information are commonly used metrics.
Early work also suggested adjusting the values of the remaining (non-pruned) weights.
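
The difference between local and global pruning can be illustrated with weight magnitude as the importance metric. The sketch below assumes each layer is a plain NumPy array and that a fixed fraction of weights (here called sparsity) is to be zeroed; the function names are hypothetical and do not come from any particular library.

```python
import numpy as np

def local_magnitude_prune(layers, sparsity):
    """Zero the smallest-magnitude fraction of weights in each layer separately."""
    pruned = []
    for W in layers:
        out = W.copy()
        k = int(sparsity * W.size)  # number of weights to zero in this layer
        # Flat indices of the k smallest magnitudes within this layer only.
        idx = np.argsort(np.abs(W), axis=None, kind="stable")[:k]
        out.flat[idx] = 0.0
        pruned.append(out)
    return pruned

def global_magnitude_prune(layers, sparsity):
    """Zero the smallest-magnitude fraction of weights across all layers at once."""
    magnitudes = np.concatenate([np.abs(W).ravel() for W in layers])
    k = int(sparsity * magnitudes.size)  # number of weights to zero in total
    idx = np.argsort(magnitudes, kind="stable")[:k]
    # offsets[i] is where layer i starts in the concatenated vector.
    offsets = np.cumsum([0] + [W.size for W in layers])
    pruned = []
    for i, W in enumerate(layers):
        out = W.copy()
        in_layer = idx[(idx >= offsets[i]) & (idx < offsets[i + 1])]
        out.flat[in_layer - offsets[i]] = 0.0
        pruned.append(out)
    return pruned
```

With local pruning every layer ends up with the same fraction of zeros, whereas global pruning may zero most of one layer and leave another untouched if their weight scales differ.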

When to prune the neural network?

Pruning can be applied at three different stages: before training, during training, or after training. When pruning is performed during or after training, additional fine-tuning epochs are typically required. Each approach involves different trade-offs between accuracy and computational cost.
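
As an illustration of pruning after training followed by fine-tuning, the sketch below uses a toy linear model trained by gradient descent. The synthetic data, the learning rate, the epoch counts, and the choice to keep the two largest-magnitude weights are all assumptions made for the example; a real network would use a deep learning framework instead.

```python
import numpy as np

def mse(X, y, w):
    """Mean squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def train(X, y, w, mask, lr=0.1, epochs=200):
    """Gradient descent on the MSE; weights where mask == 0 are held at zero."""
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

# Assumed synthetic regression problem with two weights that matter little.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.05])

# Stage 1: train the dense model (the mask keeps every weight).
w_dense = train(X, y, np.zeros(4), np.ones(4))

# Stage 2: prune after training, keeping the two largest-magnitude weights.
mask = np.zeros(4)
mask[np.argsort(np.abs(w_dense))[-2:]] = 1.0
w_pruned = w_dense * mask

# Stage 3: fine-tune the surviving weights for a few more epochs.
w_finetuned = train(X, y, w_pruned, mask, epochs=50)
```

Reapplying the mask after every update keeps the pruned weights at zero, so fine-tuning only adjusts the weights that survived pruning.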