This is a PyTorch implementation of “Training deep neural networks via direct loss minimization” (ICML 2016). The implementation targets the 0-1 loss.
The repository consists of three script files:
main.py: a demonstration of training on MNIST with the 0-1 loss
ConvNet.py: a class defining the architecture of the model used
utils.py: contains the function used to estimate the gradient
One can run the demonstration in main.py by copying and modifying (e.g. the location to save checkpoints) the command at the top of the script. Here are the results I got when training on MNIST for 100 epochs.
Figure 1. Training MNIST with 0-1 loss for 100 epochs.
Figure 2. Testing results evaluated at each epoch: (top) cross-entropy loss, and (bottom) prediction accuracy.
If you want to estimate the gradient of the 0-1 loss and integrate it into your own code, import the grad_estimation function from utils.py.
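For orientation, the core idea behind such an estimator can be sketched as follows. This is a minimal, self-contained illustration of the direct loss minimization update for the 0-1 loss (the "positive" update from the paper), written with NumPy; the function name `direct_loss_grad_01` and its signature are illustrative only and do not reflect the actual `grad_estimation` API in utils.py:

```python
import numpy as np

def direct_loss_grad_01(scores, labels, eps=1.0):
    """Finite-difference estimate of the gradient of the expected 0-1 loss
    with respect to the class scores (illustrative sketch, not the repo API).

    Compares loss-augmented inference against standard inference, following
    the direct loss minimization idea.
    scores: (N, C) array of class scores; labels: (N,) int array; eps > 0.
    """
    n, c = scores.shape
    onehot = np.eye(c)
    # 0-1 task loss: 1 for every class except the ground-truth class
    task_loss = 1.0 - onehot[labels]                    # (N, C)
    y_std = scores.argmax(axis=1)                       # standard inference
    y_aug = (scores + eps * task_loss).argmax(axis=1)   # loss-augmented inference
    # Gradient estimate: difference of the two one-hot predictions, scaled by 1/eps
    return (onehot[y_aug] - onehot[y_std]) / eps
```

In a training loop, an array like this would stand in for the gradient of the loss with respect to the network's output scores; in PyTorch one could pass it via `scores.backward(gradient=...)` and let autograd handle the layers below.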