Multi-Sample Dropout: a method that reduces training time by 4x

Kushajveer Singh
2 min read · Jun 17, 2019

Multi-Sample Dropout, introduced in the paper Multi-Sample Dropout for Accelerated Training and Better Generalization, is an extension of traditional dropout that uses multiple dropout masks for the same mini-batch.

The original dropout creates a single randomly selected subset of the input (called a dropout sample) in each training iteration, while multi-sample dropout creates multiple dropout samples. The loss is calculated for each sample, and the losses are then averaged to obtain the final loss.
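To make the idea concrete, here is a minimal PyTorch sketch of a classifier head that draws several dropout samples from the same features. The class and parameter names are my own for illustration; the notebook linked below contains the implementation I actually use.

```python
import torch.nn as nn

class MultiSampleDropoutHead(nn.Module):
    """Classifier head that applies several independent dropout masks to the
    same feature vector, producing one set of logits per dropout sample."""
    def __init__(self, in_features, num_classes, num_samples=8, p=0.5):
        super().__init__()
        # One Dropout module per sample; each draws its own random mask.
        self.dropouts = nn.ModuleList(nn.Dropout(p) for _ in range(num_samples))
        # The layers after dropout are shared across all dropout samples.
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        # Return a list of logits, one entry per dropout sample.
        return [self.fc(dropout(x)) for dropout in self.dropouts]
```

Because the weights after the dropout layers are shared, the extra samples add very little computation or memory on top of a normal forward pass.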

The paper shows that multi-sample dropout significantly accelerates training by reducing the number of iterations needed to reach convergence on image classification tasks, using the traditional training recipe of a constant learning rate with step decay. So I test this method with cyclic learning and see if I can reproduce the results from the paper.

Note: if you are not familiar with cyclic learning, I wrote a Jupyter notebook, Reproducing Leslie N. Smith’s papers using fastai, that explains the four key papers by Leslie N. Smith which introduced these techniques.

Table of Contents:

  1. Load CIFAR-100 (initially I test using CIFAR-100)
  2. ResNet-56
  3. How to implement multi-sample dropout in model
  4. Diversity among samples is needed
  5. Code for Multi-Sample Dropout
  6. Code for the Multi-Sample Dropout loss function (a rough sketch follows this list)
  7. Get baseline without Multi-Sample Dropout
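The actual code for items 5 and 6 is in the notebook. As a rough preview, the loss simply averages an ordinary loss over the dropout samples; the helper below is hypothetical and assumes the head sketched earlier.

```python
import torch
import torch.nn.functional as F

def multi_sample_loss(logits_list, target):
    """Average the per-sample cross-entropy losses into a single loss."""
    losses = [F.cross_entropy(logits, target) for logits in logits_list]
    return torch.stack(losses).mean()

# Usage inside a training step, assuming `features` comes from the ResNet body
# and `head` is the MultiSampleDropoutHead sketched above:
#   loss = multi_sample_loss(head(features), target)
#   loss.backward()
```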

All the code and a detailed discussion of the topic are covered in this Jupyter notebook.

I am thinking about shifting to jupyter notebooks for these tutorials, as it is easier to experiment in the notebooks.

So if you want to follow along and know when I publish a new notebook, you can check https://kushajveersingh.github.io/notebooks/. I will keep updating the list as I add new notebooks.

I will still make Medium posts but most of the paper implementations will be done in the Jupyter notebooks.
