MobileNet in FPGA

Generator of Verilog descriptions for a MobileNet implementation on FPGA. Several pre-trained models are available for common tasks such as detecting people, cars and animals. You can train your own model on your own dataset using the code in this repository and get the same fast detector, running in real time on FPGA, for your own task.

Software requirements

Python 3.*, keras 2.2.4, tensorflow, kito

Hardware requirements

  1. TFT-screen ILI9341 Size: 2.8", Resolution: 240x320, Interface: SPI
  2. Camera OV5640. Active array size: 2592 x 1944
  3. OpenVINO Starter Kit. Cyclone V (301K LE, 13,917 Kbits embedded memory)


How to run

  1. python3 - creates training files using the Open Images Dataset (OID).
  2. python3 - runs the training process. Creates the model weights and reports the model accuracy.
  3. python3 - fuses the batchnorm layers and rescales the model to the range (0, 1) instead of (0, 6). Returns the new rescaled model.
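
The idea behind step 3 can be illustrated with a minimal, dependency-free sketch. It is not the repository's script: the function names (`fuse_batchnorm`, `rescale_bias`, `channel_output`) are hypothetical, and it assumes a layer of the form convolution, then batchnorm, then ReLU6. Batchnorm folds into the convolution by scaling each output channel's weights and adjusting its bias. If both the input and the output of a ReLU6 layer are divided by 6, the weights stay unchanged, the bias is divided by 6, and the clip limit becomes 1.

```python
import math


def fuse_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-3):
    """Fold a batchnorm layer into the preceding convolution.

    weights holds one list of kernel values per output channel; every other
    argument holds one value per output channel.
    """
    fused_w, fused_b = [], []
    for w_ch, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel batchnorm gain
        fused_w.append([w * scale for w in w_ch])
        fused_b.append(bt + (b - m) * scale)
    return fused_w, fused_b


def clipped_relu(z, cap):
    """ReLU clipped at cap (cap=6 is ReLU6, cap=1 is the rescaled variant)."""
    return min(max(z, 0.0), cap)


def channel_output(w_ch, b, x, cap):
    """One output value of one channel: dot product, bias, clipped ReLU."""
    z = sum(w * xi for w, xi in zip(w_ch, x)) + b
    return clipped_relu(z, cap)


def rescale_bias(bias, cap=6.0):
    """Bias for a layer whose input and output are both divided by cap."""
    return [b / cap for b in bias]


if __name__ == "__main__":
    # Toy single-channel layer with made-up numbers (assumption, not model data).
    w, b = fuse_batchnorm([[0.5, -1.0]], [0.1], [2.0], [0.3], [0.2], [4.0], eps=0.0)
    x = [3.0, 1.0]                                # input in the (0, 6) range
    y6 = channel_output(w[0], b[0], x, cap=6.0)
    y1 = channel_output(w[0], rescale_bias(b)[0], [xi / 6.0 for xi in x], cap=1.0)
    print(y6 / 6.0, y1)                           # the two values should agree
```

Keeping activations in (0, 1) means feature maps need no integer bits in a fixed-point representation, which is presumably why the rescaling is done before choosing bit widths.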

Note: You can skip steps 1, 2 and 3 if you use our pretrained weight files below.

  1. python3 - finds the optimal bit widths for feature maps, weights and biases; also returns the maximum overflow of weights and biases above 1.0.
  2. python3 - generates the weights in Verilog format using the optimal bit widths from the previous step.
  3. python3 - generates intermediate feature maps for each layer and details of the first-pixel calculation (can be used for debugging).
  4. python3 - generates Verilog based on the given model and parameters such as the number of convolution blocks.
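
As a rough illustration of the bit-width search and weight export steps above, here is a small sketch of fixed-point quantization in plain Python. It is an assumption-laden stand-in, not the repository's code: the helper names are hypothetical, and the selection criterion used here (a tolerance on per-value rounding error) is only a placeholder for whatever criterion the real script applies.

```python
def max_quant_error(values, frac_bits):
    """Largest rounding error when values are stored with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return max(abs(v - round(v * scale) / scale) for v in values)


def smallest_frac_bits(values, tolerance, max_bits=16):
    """Smallest fractional bit count whose rounding error stays within tolerance."""
    for bits in range(1, max_bits + 1):
        if max_quant_error(values, bits) <= tolerance:
            return bits
    return None  # no width up to max_bits is precise enough


def max_overflow(values):
    """Largest magnitude; anything above 1.0 needs integer bits as well."""
    return max(abs(v) for v in values)


def verilog_literal(value, total_bits, frac_bits):
    """Sized binary Verilog literal holding value in two's complement fixed point."""
    q = round(value * (1 << frac_bits))
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, q))                      # saturate instead of wrapping
    mask = (1 << total_bits) - 1
    return f"{total_bits}'b{q & mask:0{total_bits}b}"


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.7]                  # made-up values for illustration
    bits = smallest_frac_bits(weights, tolerance=0.01)
    print("fractional bits:", bits, "max magnitude:", max_overflow(weights))
    for w in weights:
        print(verilog_literal(w, total_bits=bits + 2, frac_bits=bits))
```

Saturating out-of-range values is a design choice in this sketch; reporting the maximum magnitude separately lets you decide whether to add integer bits instead.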


  • 2019.10.04: We greatly improved the speed of image reading and preprocessing. It now takes only 5% of the total time instead of 77%. The speed of the 8-convolution version of the device increased from ~10 FPS to ~40 FPS.

Pre-trained models

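|                                                        | People detector (128px) | Cars detector (128px) | Animals detector (128px) |
|--------------------------------------------------------|-------------------------|-----------------------|--------------------------|
| Accuracy (%)                                           | 84.42                   | 96.31                 | 89.67                    |
| Init model (can be used for training and fine-tuning)  | people.h5               | cars.h5               | animals.h5               |
| Reduced and rescaled model                             | people.h5               | cars.h5               | animals.h5               |
| Optimal bits found                                     | 12, 11, 10, 7, 3        | 10, 9, 8, 7, 3        | 12, 11, 10, 7, 3         |
| Quartus project (verilog)                              | link                    | link                  | link                     |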

Connection of peripherals