A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.
Installing from source
For normal training and evaluation we recommend installing the package from source using a poetry virtual environment.
git clone https://github.com/eriklindernoren/PyTorch-YOLOv3 cd PyTorch-YOLOv3/ pip3 install poetry --user poetry install
You need to join the virtual environment by running
poetry shell in this directory before running any of the following commands without the
poetry run prefix.
Also have a look at the other installing method, if you want to use the commands everywhere without opening a poetry-shell.
Download pretrained weights
Install via pip
This installation method is recommended, if you want to use this package as a dependency in another python project.
This method only includes the code, is less isolated and may conflict with other packages.
Weights and the COCO dataset need to be downloaded as stated above.
See API for further information regarding the packages API.
It also enables the CLI tools
yolo-test everywhere without any additional commands.
pip3 install pytorchyolo --user
Evaluates the model on COCO test dataset.
To download this dataset as well as weights, see above.
poetry run yolo-test --weights weights/yolov3.weights
|Model||mAP (min. 50 IoU)|
|YOLOv3 608 (paper)||57.9|
|YOLOv3 608 (this impl.)||57.3|
|YOLOv3 416 (paper)||55.3|
|YOLOv3 416 (this impl.)||55.5|
Uses pretrained weights to make predictions on images. Below table displays the inference times when using as inputs images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 measurement marked shows the inference time of this implementation on my 1080ti card.
|Darknet-53 (paper)||Titan X||76|
|Darknet-53 (this impl.)||1080ti||74|
poetry run yolo-detect --images data/samples/
For argument descriptions have a look at
poetry run yolo-train --help
To train on COCO using a Darknet-53 backend pretrained on ImageNet run:
poetry run yolo-train --data config/coco.data --pretrained_weights weights/darknet53.conv.74
Track training progress in Tensorboard:
- Initialize training
- Run the command below
- Go to http://localhost:6006/
poetry run tensorboard --logdir='logs' --port=6006
Storing the logs on a slow drive possibly leads to a significant training speed decrease.
You can adjust the log directory using
--logdir <path> when running
Train on Custom Dataset
Run the commands below to create a custom model definition, replacing
<num-classes> with the number of classes in your dataset.
./config/create_custom_model.sh <num-classes> # Will create custom model 'yolov3-custom.cfg'
Add class names to
data/custom/classes.names. This file should have one row per class name.
Move the images of your dataset to
Move your annotations to
data/custom/labels/. The dataloader expects that the annotation file corresponding to the image
data/custom/images/train.jpg has the path
data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax
label_idx x_center y_center width height. The coordinates should be scaled
[0, 1], and the
label_idx should be zero-indexed and correspond to the row number of the class name in
Define Train and Validation Sets
data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.
To train on the custom dataset run:
poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data
--pretrained_weights weights/darknet53.conv.74 to train using a backend pretrained on ImageNet.
You are able to import the modules of this repo in your own project if you install the pip package
An example prediction call from a simple OpenCV python script would look like this:
import cv2 from pytorchyolo import detect, models # Load the YOLO model model = models.load_model( "<PATH_TO_YOUR_CONFIG_FOLDER>/yolov3.cfg", "<PATH_TO_YOUR_WEIGHTS_FOLDER>/yolov3.weights") # Load the image as a numpy array img = cv2.imread("<PATH_TO_YOUR_IMAGE>") # Convert OpenCV bgr to rgb img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Runs the YOLO model on the image boxes = detect.detect_image(model, img) print(boxes) # Output will be a numpy array in the following format: # [[x1, y1, x2, y2, confidence, class]]
For more advanced usage look at the method's doc strings.