Learning and Building Convolutional Neural Networks using PyTorch

Image Classification Using Deep Learning

This repository is for learning and building convolutional neural networks (CNNs) using PyTorch. Models are selected by the citation count of their papers, with the help of Papers with Code, along with models built on a distinctive idea that departs from the typical architecture, such as using transformers for image classification.

Image classification is a fundamental computer vision task with wide scope in applications such as self-driving cars, medical imaging, and video frame prediction. Each model either introduces a new idea or builds upon an existing one. We'll capture each of these ideas, experiment with them, and benchmark them against predefined criteria.

Try Out In Google Colab

Papers With Implementation

Base Config: { epochs: 10, lr: 0.001, batch_size: 128, img_resolution: 224, optim: adam }.
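The base config above can also be written as a small Python structure, which makes per-model overrides (such as a smaller batch size) explicit. This is only a sketch: the class name `BaseConfig` and the helper `with_batch_size` are illustrative and are not taken from the repository, which keeps its parameters in a YAML file.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BaseConfig:
    """Values copied from the base config line above."""
    epochs: int = 10
    lr: float = 0.001
    batch_size: int = 128
    img_resolution: int = 224
    optim: str = "adam"


def with_batch_size(cfg: BaseConfig, batch_size: int) -> BaseConfig:
    """Return a copy of cfg with only the batch size changed (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return replace(cfg, batch_size=batch_size)


base = BaseConfig()
# Example override: a model that needs a smaller batch than the base config.
reduced = with_batch_size(base, 96)
print(base)
print(reduced)
```

Keeping the config immutable means an override never silently changes the shared base values.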

Some architectures, such as SqueezeNet, ShuffleNet, InceptionV3, EfficientNet, Darknet53 and others, didn't run at the base config because of their increased complexity. These were run in Google Colab and Kaggle with a reduced batch size.

Optimizer script: Includes the approximate optimizer setup used in each original paper. A few setups led to NaN loss and were tweaked to make training work.

I've noticed that Google Colab has a 12 GB GPU while Kaggle has a 16 GB GPU. So, in the worst case, I've reduced the batch size until the model fits on the Kaggle GPU. For reference, my own GPU is an RTX 2070 with 8 GB.
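The batch-size reduction described above can be sketched as a simple fallback loop. This is a hedged illustration, not the repository's code: the function name `largest_fitting_batch_size`, the halving strategy, and the `try_batch` callable are all assumptions. In PyTorch, a CUDA out-of-memory failure surfaces as a `RuntimeError` (or a subclass of it), which is what the sketch catches; here a fake step function stands in for a real training step.

```python
def largest_fitting_batch_size(try_batch, start=128, minimum=8):
    """Halve the batch size until try_batch succeeds (assumed strategy).

    try_batch is a hypothetical callable that runs one training step at the
    given batch size and raises RuntimeError when the GPU runs out of memory.
    Returns the first batch size that works, or None if even `minimum` fails.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            try_batch(batch_size)
            return batch_size
        except RuntimeError:
            # A real version should inspect the error message and re-raise
            # anything that is not an out-of-memory failure.
            batch_size //= 2
    return None


def fake_step(batch_size):
    """Stand-in for a training step: pretend anything above 40 does not fit."""
    if batch_size > 40:
        raise RuntimeError("CUDA out of memory (simulated)")


def always_fails(batch_size):
    """Stand-in for a model that is too large at every batch size tried."""
    raise RuntimeError("CUDA out of memory (simulated)")


fitted = largest_fitting_batch_size(fake_step)
too_large = largest_fitting_batch_size(always_fails)
print(fitted, too_large)
```

A `None` result corresponds to the case where the model does not fit even at the smallest batch size tried.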

| CNN Based | Accuracy (%) | Parameters | FLOPs | Configuration | LR Scheduler (Accuracy) |
| --- | --- | --- | --- | --- | --- |
| AlexNet | 71.27 | 58.32M | 1.13 GFLOPs | - | CyclicLR (79.56) |
| VGGNet | 75.93 | 128.81M | 7.63 GFLOPs | - | - |
| Network In Network | 71.03 | 2.02M | 0.833 GFLOPs | - | - |
| ResNet | 83.39 | 11.18M | 1.82 GFLOPs | - | CyclicLR (74.9) |
| DenseNet-Depth40 | 68.25 | 0.18M | - | B_S = 8 | - |
| MobileNetV1 | 81.72 | 3.22M | 0.582 GFLOPs | - | - |
| MobileNetV2 | 83.99 | 2.24M | 0.318 GFLOPs | - | - |
| GoogLeNet | 80.28 | 5.98M | 1.59 GFLOPs | - | - |
| InceptionV3 | - | - | 209.45 GFLOPs | H_C_R | |
| Darknet-53 | - | - | 7.14 GFLOPs | H_C_R | |
| Xception | 85.9 | 20.83M | 4.63 GFLOPs | B_S = 96 | - |
| ResNeXt | - | | 69.41 GFLOPs | H_C_R | |
| SENet | 83.39 | 11.23M | 1.82 GFLOPs | - | CyclicLR (78.10) |
| SqueezeNet | 62.2 | 0.73M | 2.64 GFLOPs | B_S = 64 | |
| ShuffleNet | | | 2.03 GFLOPs | B_S = 32 | - |
| EfficientNet-B0 | - | 4.02M | 0.4 GFLOPs | | |
| Transformer Based | | | | | |
| ViT | - | 53.59M | - | - | WarmupCosineSchedule (55.34) |
| MLP Based | | | | | |
| MLP-Mixer | 68.52 | 13.63M | - | - | WarmupLinearSchedule (69.5) |
| ResMLP | 65.5 | 14.97M | - | - | - |

B_S - Batch Size \
H_C_R - High Compute Required

Note: A few cells are marked as high compute required because even with batch_size = 8 the Kaggle compute was not enough. The reported accuracy is lower than what the models can reach because each model is trained for only 10 epochs; with more epochs the models converge further. The effect of the learning rate scheduler is easy to underestimate, so try out different schedulers to get the most out of each network.
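To make the scheduler point concrete, here is a minimal sketch of a linear-warmup plus cosine-decay schedule, written in plain Python. It is an assumption about what a schedule like WarmupCosineSchedule computes, not the repository's implementation; the function name `warmup_cosine_factor` and the step counts are illustrative. The function returns a multiplier for the base learning rate, which is the form accepted by PyTorch's `torch.optim.lr_scheduler.LambdaLR`.

```python
import math


def warmup_cosine_factor(step, warmup_steps, total_steps):
    """Multiplier applied to the base learning rate at a given step.

    Rises linearly from 0 to 1 over warmup_steps, then follows half a cosine
    wave from 1 down to 0 at total_steps (assumed schedule shape).
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # hold at the final value past total_steps
    return 0.5 * (1.0 + math.cos(math.pi * progress))


base_lr = 0.001  # lr from the base config
warmup_steps, total_steps = 10, 100  # illustrative values
schedule = [base_lr * warmup_cosine_factor(s, warmup_steps, total_steps)
            for s in range(total_steps + 1)]

# Print a few points to see the warmup and the decay.
for s in (0, 5, 10, 55, 100):
    print(s, schedule[s])
```

Plotting `schedule` against the step index is a quick way to compare schedule shapes before spending compute on full training runs.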

Google Colab Notebook to tune hyperparameters with Weights and Biases Visualizations

Create Environment

python -m venv CNNs
source CNNs/bin/activate
git clone https://github.com/Mayurji/CNNs-PyTorch.git
cd CNNs-PyTorch

Installation

pip install -r requirements.txt

Run

python main.py --model=resnet

To Save Model

python main.py --model=resnet --model_save=True

To Create Checkpoint

python main.py --model=resnet --checkpoint=True

Note: Parameters can be changed in the YAML file. The module supports only two datasets, MNIST and CIFAR-10, but you can modify the dataset file to include other datasets.
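If you extend the command-line flags shown above, note that a value such as `--model_save=True` reaches the program as the string "True". The sketch below shows one way to parse such flags with `argparse`; it is an assumption for illustration, not how `main.py` is known to work. The helper `str2bool`, the function `build_parser`, and the `False` defaults are hypothetical.

```python
import argparse


def str2bool(value):
    """Convert a command-line string to a bool (hypothetical helper).

    Using type=bool would not work here, since bool("False") is True.
    """
    text = value.strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser for the three flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train a model (sketch).")
    parser.add_argument("--model", required=True, help="model name, e.g. resnet")
    parser.add_argument("--model_save", type=str2bool, default=False)
    parser.add_argument("--checkpoint", type=str2bool, default=False)
    return parser


# Parse an explicit argument list instead of sys.argv so the example is
# reproducible when run as a script.
args = build_parser().parse_args(["--model=resnet", "--model_save=True"])
print(args.model, args.model_save, args.checkpoint)
```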