Detecto is a Python package that allows you to build fully-functioning computer vision and object detection models with just 5 lines of code. Inference on still images and videos, transfer learning on custom datasets, and serialization of models to files are just a few of Detecto's features. Detecto is also built on top of PyTorch, allowing an easy transfer of models between the two libraries.

The table below shows a few examples of Detecto's performance:

Still Image Video
apple_orange demo


To install Detecto using pip, run the following command:

pip3 install detecto

Installing with pip should download all of Detecto's dependencies automatically.
However, if an issue arises, you can manually download the dependencies listed in the requirements.txt file.


The power of Detecto comes from its simplicity and ease of use. Creating and running a pre-trained
Faster R-CNN ResNet-50 FPN
from PyTorch's model zoo takes 4 lines of code:

from detecto.core import Model
from detecto.visualize import detect_video

model = Model()  # Initialize a pre-trained model
detect_video(model, 'input_video.mp4', 'output.avi')  # Run inference on a video

Below are several more examples of things you can do with Detecto:

Transfer Learning on Custom Datasets

Most of the times, you want a computer vision model that can detect custom objects. With Detecto, you can train a model on a custom dataset with 5 lines of code:

from detecto.core import Model, Dataset

dataset = Dataset('custom_dataset/')  # Load images and label data from the custom_dataset/ folder

model = Model(['dog', 'cat', 'rabbit'])  # Train to predict dogs, cats, and rabbits

model.predict(...)  # Start using your trained model!

Inference and Visualization

When using a model for inference, Detecto returns predictions in an easy-to-use format and provides several visualization tools:

from detecto.core import Model
from detecto import utils, visualize

model = Model()

image = utils.read_image('image.jpg')  # Helper function to read in images

labels, boxes, scores = model.predict(image)  # Get all predictions on an image
predictions = model.predict_top(image)  # Same as above, but returns only the top predictions

print(labels, boxes, scores)

visualize.show_labeled_image(image, boxes, labels)  # Plot predictions on a single image

images = [...]
visualize.plot_prediction_grid(model, images)  # Plot predictions on a list of images

visualize.detect_video(model, 'input_video.mp4', 'output.avi')  # Run inference on a video

Advanced Usage

If you want more control over how you train your model, Detecto lets you do just that:

from detecto import core, utils
from torchvision import transforms
import matplotlib.pyplot as plt

# Convert XML files to CSV format
utils.xml_to_csv('training_labels/', 'train_labels.csv')
utils.xml_to_csv('validation_labels/', 'val_labels.csv')

# Define custom transforms to apply to your dataset
custom_transforms = transforms.Compose([

# Pass in a CSV file instead of XML files for faster Dataset initialization speeds
dataset = core.Dataset('train_labels.csv', 'images/', transform=custom_transforms)
val_dataset = core.Dataset('val_labels.csv', 'val_images')  # Validation dataset for training

# Create your own DataLoader with custom options
loader = core.DataLoader(dataset, batch_size=2, shuffle=True) 

model = core.Model(['car', 'truck', 'boat', 'plane'])
losses =, val_dataset, epochs=15, learning_rate=0.001, verbose=True)

plt.plot(losses)  # Visualize loss throughout training'model_weights.pth')  # Save model to a file

# Directly access underlying torchvision model for even more control
torch_model = model.get_internal_model()

For more examples, visit the docs, which includes a quickstart tutorial.

Alternatively, check out the demo on Colab.

API Documentation

The full API documentation can be found at
The docs are split into three sections, each corresponding to one of Detecto's modules:


The detecto.core module contains the central classes of the package: Dataset, DataLoader, and Model.
These are used to read in a labeled dataset and train a functioning object detection model.


The detecto.utils module contains a variety of useful helper functions.
With it, you can read in images, convert XML files into CSV files, apply standard transforms to images, and more.


The detecto.visualize module is used to display labeled images, plot predictions, and run object detection on videos.