Real-Time Sign Language Detection using Action Recognition
Real-time sign language is commonly predicted using models whose architecture consists of multiple CNN layers followed by multiple LSTM layers. However, the accuracy of these state-of-the-art models is fairly low. This approach, MediaPipe Holistic combined with an LSTM model, achieves much better accuracy and produces good results with far less data. Since the model has fewer parameters, it also trains much faster, reducing computation time.
This project is divided into two parts:
- Keypoint extraction using MediaPipe Holistic
- An LSTM model trained on these keypoints to predict sign language in real time from video sequences.
Data is collected using MediaPipe Holistic for 3 actions:
- I Love You
For each action, 30 sequences have been collected, each consisting of 30 frames, captured in real time using computer vision and MediaPipe Holistic. For each frame, 1662 keypoint values are extracted (see the sketch after this list):
- Face Landmarks – 468*3
- Pose Landmarks – 33*4
- Left Hand Landmarks – 21*3
- Right Hand Landmarks – 21*3
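For reference, here is a minimal sketch of how these landmarks can be flattened into a single 1662-value vector per frame, a common pattern with MediaPipe Holistic. The function name and the zero-filled fallbacks are illustrative, not necessarily what the Feature_Extraction code does:

```python
import numpy as np

def extract_keypoints(results):
    """Flatten MediaPipe Holistic results into one 1662-value vector."""
    # Pose: 33 landmarks x (x, y, z, visibility) = 132 values
    pose = np.array([[lm.x, lm.y, lm.z, lm.visibility]
                     for lm in results.pose_landmarks.landmark]).flatten() \
        if results.pose_landmarks else np.zeros(33 * 4)
    # Face: 468 landmarks x (x, y, z) = 1404 values
    face = np.array([[lm.x, lm.y, lm.z]
                     for lm in results.face_landmarks.landmark]).flatten() \
        if results.face_landmarks else np.zeros(468 * 3)
    # Each hand: 21 landmarks x (x, y, z) = 63 values
    lh = np.array([[lm.x, lm.y, lm.z]
                   for lm in results.left_hand_landmarks.landmark]).flatten() \
        if results.left_hand_landmarks else np.zeros(21 * 3)
    rh = np.array([[lm.x, lm.y, lm.z]
                   for lm in results.right_hand_landmarks.landmark]).flatten() \
        if results.right_hand_landmarks else np.zeros(21 * 3)
    # 132 + 1404 + 63 + 63 = 1662 values per frame
    return np.concatenate([pose, face, lh, rh])
```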
The dataset can be accessed from the
The LSTM model is trained on the keypoints extracted in the
Feature_Extraction folder and is later used for real-time predictions.
The weights of the model are saved in the
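As a rough sketch, such a network could be defined in Keras and trained on sequences of shape (30, 1662). The layer sizes and the action.h5 filename below are assumptions, not taken from this repository:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

num_actions = 3  # number of sign classes

model = Sequential([
    # Input: 30 frames per sequence, 1662 keypoint values per frame
    LSTM(64, return_sequences=True, activation='relu', input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(num_actions, activation='softmax'),  # one probability per action
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
# model.fit(X_train, y_train, epochs=100)
# model.save_weights('action.h5')  # hypothetical weights filename
```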
How to Use
Clone the repository using:
$ git clone https://github.com/rishusiva/Pose-Network
Install the requirements using:
$ cd Pose-Network/
$ pip install -r requirements.txt
To predict sign language in real time, run:
$ cd Pose-Network/Code
$ python3 realtime_testing.py
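Internally, a real-time loop like this typically keeps a sliding window of the last 30 keypoint vectors and runs the model on it. The sketch below reuses extract_keypoints and model from the sketches above and uses placeholder action labels; it illustrates the idea rather than the exact contents of realtime_testing.py:

```python
import cv2
import numpy as np
import mediapipe as mp

# Placeholder labels: only "I Love You" is listed above; the others are illustrative.
actions = ['I Love You', 'action_2', 'action_3']
sequence = []  # rolling buffer of the last 30 keypoint vectors

cap = cv2.VideoCapture(0)
with mp.solutions.holistic.Holistic(min_detection_confidence=0.5,
                                    min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        sequence.append(extract_keypoints(results))
        sequence = sequence[-30:]  # keep only the last 30 frames
        if len(sequence) == 30:
            probs = model.predict(np.expand_dims(sequence, axis=0))[0]
            cv2.putText(frame, actions[int(np.argmax(probs))], (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow('Sign Language Detection', frame)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```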
- Our LSTM model reaches an accuracy of 70% after training for only 100 epochs.
- It produced an accuracy score of 1.0 on a test set of 5 images.
- The trained LSTM model is then used for real-time testing.
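For reference, an accuracy score like this can be computed with scikit-learn once one-hot labels and predictions are collapsed to class indices; X_test and y_test are assumed variable names:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# X_test / y_test: held-out sequences and their one-hot labels (assumed names)
y_true = np.argmax(y_test, axis=1)
y_pred = np.argmax(model.predict(X_test), axis=1)
print(accuracy_score(y_true, y_pred))  # reported as 1.0 on the 5-sample test set
```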
- Rishikesh Sivakumar