Naruto Hand Sign Classification

This project is based on the popular Japanese animated series “Naruto” by Masashi Kishimoto. In the show, ninjas weave combinations of hand signs to use their techniques, referred to as jutsus. This project uses deep learning and computer vision techniques to identify which hand sign an individual has made.

Live Demo Classification

The trained model performed well in the live demo, achieving an accuracy of 83.33% or higher. The Dog and Serpent classes are occasionally misclassified in certain situations, but the other 10 classes performed strongly. See below for a demonstration of the live classification.



We experimented with the MobileNetV2, ResNet50, VGG16, and InceptionV3 architectures due to their prominence in image classification tasks. We froze all but their last few layers to take advantage of the feature-extraction capabilities they learned from training on the ImageNet dataset, effectively applying transfer learning to our image classification task. Of the four architectures, VGG16 performed best, achieving 93.60% accuracy on the manually curated static test dataset and a minimum accuracy of 83.33% in the live demo.
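The transfer-learning setup described above can be sketched in Keras as follows. This is a minimal sketch, assuming 12 hand-sign classes, 224x224 RGB inputs, and "last few layers" meaning the final four layers of the base network; none of these specifics are confirmed by the project.

```python
# Transfer-learning sketch: VGG16 pre-trained on ImageNet with a new
# classification head. Class count, input size, and the number of
# unfrozen layers are assumptions.
import tensorflow as tf

NUM_CLASSES = 12  # assumed: 12 hand-sign classes per the demo description

# Load VGG16 pre-trained on ImageNet, without its original classifier.
base = tf.keras.applications.VGG16(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)

# Freeze all but the last few layers so the pre-trained feature
# extractor is reused and only the top of the network adapts.
for layer in base.layers[:-4]:
    layer.trainable = False

# Attach a new classification head for the hand-sign classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The same pattern applies to MobileNetV2, ResNet50, and InceptionV3 by swapping the `tf.keras.applications` constructor.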

The architecture of the VGG16 model is shown below.


See links below for details of the four models.

  1. MobileNetV2
  2. ResNet50
  3. VGG16
  4. InceptionV3


The data used in this project was collected manually by recording video from a webcam with OpenCV, followed by a number of data augmentations, such as flips and rotations, to increase the dataset size.

Packages Used

The requirements for the project are as follows:

  1. Python=3.7.7
  2. Tensorflow=2.8.0
  3. OpenCV=
  4. Numpy=1.21.5
  5. Pillow=9.1.0
  6. Mediapipe=0.8.10


Contributors to the project are listed below. Click a name to follow the contributor and see more of their projects.

  1. Saad Hossain
  2. Jaeyoung Kang
  3. Yazan Masoud
  4. Michael Frew

