ActionAI is a Python library for training machine learning models to classify human actions. It is a generalization of our yoga smart personal trainer, which is included in this repo as an example.
These instructions will show how to prepare your image data, train a model, and deploy the model to classify human action from image samples. See deployment for notes on how to deploy the project on a live stream.
We recommend using a virtual environment to avoid any conflicts with your system's global configuration. You can install the required dependencies via pip:
Jetson Nano Installation
We use the trt_pose repo to extract pose estimations. Refer to that repo for instructions on installing the required dependencies.
You will also need to download these zipped model assets and unzip the package into the
# Assuming your python path points to python 3.x
$ pip install -r requirements.txt
All preprocessing, training, and deployment configuration variables are stored in the
conf.py file in the
config/ directory. You can create your own conf.py files and store them in this directory for fast experimentation.
The included conf.py file imports a LinearRegression model as our classifier by default.
After preprocessing your image data using the
preprocess.py script, you can create a model by calling the
actionModel() function, which creates a scikit-learn pipeline. Then, call the
trainModel() function with your data to train:
# Stage your model
pipeline = actionModel(config.classifier())

# Train your model
model = trainModel(config.csv_path, pipeline)
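To make the stage-then-train flow concrete, here is a minimal sketch of a scikit-learn pipeline being built and fitted. The helpers build_pipeline and fit_pipeline are hypothetical stand-ins, not the repo's actionModel() and trainModel(); the scaling step, the LogisticRegression classifier, and the toy in-memory data are all assumptions made for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(classifier):
    """Hypothetical stand-in for actionModel(): scale features, then classify."""
    return Pipeline([("scale", StandardScaler()), ("clf", classifier)])


def fit_pipeline(features, labels, pipeline):
    """Hypothetical stand-in for trainModel(): fits on in-memory data, not a CSV path."""
    return pipeline.fit(features, labels)


# Toy 2-D "pose features"; real features would come from estimated keypoints.
features = [[0.0, 0.0], [0.5, 0.2], [0.2, 0.4], [10.0, 10.0], [9.5, 10.2], [10.3, 9.7]]
labels = ["stand", "stand", "stand", "squat", "squat", "squat"]

model = fit_pipeline(features, labels, build_pipeline(LogisticRegression()))
predictions = list(model.predict([[0.1, 0.1], [9.9, 9.9]]))
```

Because the classifier is passed in as an argument, swapping in a different scikit-learn estimator only changes the call site, which is the kind of fast experimentation the config file is meant to support.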
Arrange your image data as a directory of subdirectories, each subdirectory named as a label for the images contained in it. Your directory structure should look like this:
├── images_dir
│   ├── class_1
│   │   ├── sample1.png
│   │   ├── sample2.jpg
│   │   ├── ...
│   ├── class_2
│   │   ├── sample1.png
│   │   ├── sample2.jpg
│   │   ├── ...
.   .
.   .
Samples should be standard image files recognized by the Pillow library.
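As an illustration of how this layout maps to labels, the sketch below walks such a directory and pairs each image with the name of its parent subdirectory. The function name list_labeled_images and the extension whitelist are assumptions for this example; it is not the repo's preprocessing code.

```python
from pathlib import Path

# Assumed set of extensions for this sketch; Pillow itself recognizes many more formats.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_labeled_images(images_dir):
    """Return sorted (image_path, label) pairs, taking each label from the subdirectory name."""
    samples = []
    for class_dir in sorted(Path(images_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(image_path), class_dir.name))
    return samples
```

Calling list_labeled_images("images_dir") on the tree above would yield pairs such as a path ending in class_1/sample1.png with the label class_1.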
To generate a dataset from your images, run the preprocess.py script:
$ python preprocess.py
This will stage the labeled image dataset in a csv file written to the
After reading the csv file into a dataframe, a custom scikit-learn transformer estimates body keypoints to produce a low-dimensional feature vector for each sample image. This representation is fed into the scikit-learn classifier set in the config file. This approach works well for lightweight applications that need to classify a pose, as in the YogAI use case.
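To give a feel for what a low-dimensional pose representation can look like, here is a minimal sketch that flattens a list of (x, y) keypoints into one vector. The bounding-box normalization and the function name keypoints_to_features are assumptions for illustration; the source does not say how the repo's transformer builds its features.

```python
def keypoints_to_features(keypoints):
    """Flatten (x, y) keypoints into one vector, normalized to the pose's bounding box."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    min_x, min_y = min(xs), min(ys)
    # Guard against a zero-size box (e.g. a single keypoint) to avoid dividing by zero.
    width = (max(xs) - min_x) or 1.0
    height = (max(ys) - min_y) or 1.0
    features = []
    for x, y in keypoints:
        features.extend([(x - min_x) / width, (y - min_y) / height])
    return features


# Three made-up keypoints in pixel coordinates.
vector = keypoints_to_features([(100, 50), (150, 150), (200, 250)])
# vector == [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
```

Normalizing to the bounding box makes the vector independent of where the person stands in the frame and how large they appear, which is one plausible reason a simple classifier can work on such features.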
Run the train.py script to train and save a classifier:
$ python train.py
The pickled model will be saved in the
To train a more complex model to classify a sequence of poses culminating in an action (e.g., a squat or spin), use the
train_sequential.py script. This script will train an LSTM model to classify movements.
$ python train_sequential.py
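Sequence models such as LSTMs usually consume fixed-length runs of per-frame features rather than single frames. The sketch below shows one common way to cut a stream of frame features into overlapping windows; the function name make_windows, the window size, and the stride are assumptions for illustration and are not taken from train_sequential.py.

```python
def make_windows(frame_features, window_size, stride=1):
    """Group consecutive per-frame feature vectors into overlapping fixed-length windows."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    last_start = len(frame_features) - window_size
    # An input shorter than window_size yields an empty list.
    return [
        frame_features[start:start + window_size]
        for start in range(0, last_start + 1, stride)
    ]


# Five frames with one made-up feature each, cut into windows of three frames.
frames = [[0.0], [0.1], [0.2], [0.3], [0.4]]
windows = make_windows(frames, window_size=3, stride=2)
```

Each window would then carry a single action label, so the model learns from the movement across frames instead of from one pose.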
We've provided a sample inference script,
inference.py, that will read input from a webcam, mp4, or RTSP stream, run inference on each frame, and print inference results.
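The three input kinds can be told apart from a single command-line argument. The sketch below assumes frames are read with something like OpenCV's cv2.VideoCapture, which accepts either an integer device index or a string file path or URL; parse_source is a hypothetical helper, not part of inference.py.

```python
def parse_source(source):
    """Map a command-line argument to a camera index (int) or a path/URL (str)."""
    source = source.strip()
    if source.isdigit():
        return int(source)  # e.g. "0" selects the first webcam
    return source  # file paths and rtsp:// URLs pass through unchanged
```

With this helper, parse_source("0") returns the integer 0, while parse_source("/path/to/file.mp4") returns the path unchanged.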
If you are running on a Jetson Nano, the
iva.py script will perform multi-person tracking and activity recognition like the demo gif above Getting Started. Simply run:
$ python iva.py 0

# or if you have a video file
$ python iva.py /path/to/file.mp4
If specified, this script will write a labeled video as
We've also included a script under the experimental folder,
teachable_machine.py, that supports labeling samples via a PS3 controller on a Jetson Nano and training in real time from a webcam stream. This requires these extra dependencies:
To test it, run:
# Using a webcam
$ python experimental/teachable_machine.py /dev/video0

# Using a video asset
$ python experimental/teachable_machine.py /path/to/file.mp4
This script will also write labeled data to a csv file stored in the
data/ directory and produce a video asset