NOTE: Still in progress, will update constantly, thank you for your attention!

This repo is the a codebase of the Joint Detection and Embedding (JDE) model. JDE is a fast and high-performance multiple-object tracker that learns the object detection task and appearance embedding task simutaneously in a shared neural network. Techical details are described in our arXiv preprint paper. By using this repo, you can simply achieve MOTA 64%+ on the "private" protocol of MOT-16 challenge, and with a near real-time speed at 18~24 FPS (Note this speed is for the entire system, including the detection step! ) .

We hope this repo will help researches/engineers to develop more practical MOT systems. For algorithm development, we provide training data, baseline models and evaluation methods to make a level playground. For application usage, we also provide a small video demo that takes raw videos as input without any bells and whistles.


  • Python 3.6
  • Pytorch >= 1.0.1
  • syncbn (Optional, compile and place it under utils/syncbn, or simply replace with nn.BatchNorm here)
  • maskrcnn-benchmark (Their GPU NMS is used in this project)
  • python-opencv
  • ffmpeg (Optional, used in the video demo)

Video Demo





python --input-video path/to/your/input/video --weights path/to/model/weights
               --output-format video --output-root path/to/output/root