Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement.


Xin Liu, Josh Fromm, Shwetak Patel, Daniel McDuff, “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement”, NeurIPS 2020, Oral Presentation (105 out of 9454 submissions)


Telehealth and remote health monitoring have become increasingly important during the SARS-CoV-2 pandemic and it is widely expected that this will have a lasting impact on healthcare practices. These tools can help reduce the risk of exposing patients and medical staff to infection, make healthcare services more accessible, and allow providers to see more patients. However, objective measurement of vital signs is challenging without direct contact with a patient. We present a video-based and on-device optical cardiopulmonary vital sign measurement approach. It leverages a novel multi-task temporal shift convolutional attention network (MTTS-CAN) and enables real-time cardiovascular and respiratory measurements on mobile platforms. We evaluate our system on an ARM CPU and achieve state-of-the-art accuracy while running at over 150 frames per second which enables real-time applications. Systematic experimentation on large benchmark datasets reveals that our approach leads to substantial (20%-50%) reductions in error and generalizes well across datasets.

Waveform Samples






  title={Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement},
  author={Liu, Xin and Fromm, Josh and Patel, Shwetak and McDuff, Daniel},
  journal={arXiv preprint arXiv:2006.03790},


Try out our live demo via link here.

Our demo code:


If you want to use TVM, pleaea follow this tutorial to set it up. Then, you will need to replace the code in incubator-tvm/python/tvm/relay/frontend/ with our code/ We implemented required tensor operations for attention, tensor shift module used in our models.


python code/ --exp_name test --exp_name [e.g., test] --data_dir [DATASET_PATH] --temporal [e.g., MMTS_CAN]


python code/ --video_path [VIDEO_PATH]

The default video sampling rate is 30Hz.


During the inference, the program will generate a sample pre-processed frame. Please ensure it is in portrait orientation. If not, you can comment out line 30 (rotation) in the


Tensorflow 2.0+

conda create -n tf-gpu tensorflow-gpu cudatoolkit=10.1 -- this command takes care of both CUDA and TF environments.

pip install opencv-python scipy numpy matplotlib

Ifpip install opencv-python does not work, I found these commands always work on my mac.

conda install -c menpo opencv -y
pip install opencv-python