Code for “Learning to Segment Rigid Motions from Two Frames”.

** This is a partial release with inference and evaluation code. The project is still being tested and documented, and the implementation may change in future releases. Thanks for your interest.

Visuals on Sintel/KITTI/Coral (not temporally smoothed):

If you find this work useful, please consider citing:

  title={Learning to Segment Rigid Motions from Two Frames},
  author={Yang, Gengshan and Ramanan, Deva},
  journal={arXiv preprint arXiv:2101.03694},

Data and precomputed results


Additional inputs (coral reef images) and precomputed results are hosted on Google Drive. To download and unpack them, run (assuming gdown is installed):

gdown https://drive.google.com/uc?id=1Up2cPCjzd_HGafw1AB2ijGmiKqaX5KTi -O ./input.tar.gz
gdown https://drive.google.com/uc?id=12C7rl5xS66NpmvtTfikr_2HWL5SakLVY -O ./rigidmask-sf-precomputed.zip
tar -xzvf ./input.tar.gz 
unzip ./rigidmask-sf-precomputed.zip -d precomputed/

To compute the results in Tab. 1 and Tab. 2 on KITTI from the precomputed outputs,

python eval/eval_seg.py  --path precomputed/$modelname/  --dataset 2015
python eval/eval_sf.py   --path precomputed/$modelname/  --dataset 2015
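The commands above rely on a `$modelname` shell variable that this README never sets. A minimal sketch, assuming `$modelname` is one of the model names listed under "Pretrained models" (`rigidmask-sf` here) and that the archive unpacks into a matching sub-folder of `precomputed/`; the variable `result_dir` is a hypothetical helper name, and the folder layout is an assumption, so the evaluation only runs if the folder is actually present.

```shell
# Assumption: $modelname matches a sub-folder of precomputed/ after unzipping.
# rigidmask-sf is one of the model names listed under "Pretrained models".
modelname=rigidmask-sf
result_dir="precomputed/$modelname/"

# Run the two evaluation scripts only if the expected folder exists.
if [ -d "$result_dir" ]; then
    python eval/eval_seg.py --path "$result_dir" --dataset 2015
    python eval/eval_sf.py  --path "$result_dir" --dataset 2015
else
    echo "missing $result_dir: check what the archive unpacked to (ls precomputed/)"
fi
```

If the folder is missing, listing `precomputed/` shows the name that should be assigned to `modelname` instead.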


The code is tested with Python 3.8, PyTorch 1.7.0, and CUDA 10.2. Install the dependencies with

conda env create -f rigidmask.yml
conda activate rigidmask_v0
pip install kornia
python -m pip install detectron2 -f \

Compile DCNv2 and ngransac.

cd models/networks/DCNv2/; python setup.py install; cd -
cd models/ngransac/; python setup.py install; cd -
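Since the code is only tested with one combination of Python, PyTorch, and CUDA, it can help to confirm what the active environment provides before running anything. The following is a sketch, not part of the repository: `check_env` is a hypothetical helper name, and it assumes `python` on the PATH is the interpreter of the `rigidmask_v0` environment.

```shell
# check_env: print interpreter, PyTorch and CUDA versions of the active environment.
check_env() {
    python - <<'EOF' || echo "check failed: is the rigidmask_v0 environment active?"
import sys
print("python", sys.version.split()[0])
import torch
print("torch", torch.__version__)
print("cuda (torch build)", torch.version.cuda)
print("gpu visible", torch.cuda.is_available())
EOF
}

check_env
```

On the tested setup the report should show Python 3.8.x, PyTorch 1.7.0, and CUDA 10.2; other combinations may work but are not covered by the statement above.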

Pretrained models

Download pre-trained models to ./weights (assuming gdown is installed),

mkdir weights
mkdir weights/rigidmask-sf
mkdir weights/rigidmask-kitti
gdown https://drive.google.com/uc?id=1H2khr5nI4BrcrYMBZVxXjRBQYBcgSOkh -O ./weights/rigidmask-sf/weights.pth
gdown https://drive.google.com/uc?id=1sbu6zVeiiK1Ra1vp_ioyy1GCv_Om_WqY -O ./weights/rigidmask-kitti/weights.pth
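The download steps above fail on a second run because `mkdir` errors on existing folders, and they always fetch the files again. A re-runnable variant could look like the sketch below; `fetch_weights` is a hypothetical helper, not something the repository ships. Quoting the URL also keeps shells such as zsh from treating the `?` as a glob character.

```shell
# fetch_weights <modelname> <drive-id>: download one checkpoint unless it is already present.
fetch_weights() {
    name="$1"
    id="$2"
    target="./weights/$name/weights.pth"
    mkdir -p "./weights/$name"    # -p: no error if the folder already exists
    if [ -f "$target" ]; then
        echo "$target already present, skipping"
    else
        gdown "https://drive.google.com/uc?id=$id" -O "$target"
    fi
}
```

For example, `fetch_weights rigidmask-sf 1H2khr5nI4BrcrYMBZVxXjRBQYBcgSOkh` and `fetch_weights rigidmask-kitti 1sbu6zVeiiK1Ra1vp_ioyy1GCv_Om_WqY` reproduce the two downloads listed above.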
| modelname | training set | flow model | flow err. (K:Fl-err/EPE) | motion-in-depth err. (K:1e4) | seg. acc. (K:obj/K:bg/S:bg) |
| --- | --- | --- | --- | --- | --- |
| rigidmask-sf (mono) | SF | C+SF+V | 10.9%/3.128px | 120.4 | 90.71%/97.05%/86.72% |
| rigidmask-kitti (stereo) | SF+KITTI | C+SF+V->KITTI | 4.1%/1.155px | 49.7 | 95.58%/98.91%/- |

** C: FlyingChairs, SF: SceneFlow (including FlyingThings, Monkaa, and Driving), K: KITTI scene flow training set, V: VIPER, S: Sintel.


Run and visualize rigid segmentation of the coral reef video (pass --refine to turn on rigid motion refinement). Results will be saved at ./weights/$modelname/seq/ and an output-seg.gif file will be generated in the current folder.

CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-coral --datapath input/imgs/coral/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1
python eval/generate_visual.py --datapath weights/$modelname/seq-coral/ --imgpath input/imgs/coral
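The commands in this README set `CUDA_VISIBLE_DEVICES=1`, which exposes only the GPU with index 1; on a single-GPU machine that leaves no GPU visible. The wrapper below is a sketch under stated assumptions: `run_coral`, `gpu`, and the `GPU` environment variable are hypothetical names, and `modelname` falls back to `rigidmask-sf` if it is not already set. Extra arguments such as `--refine` are forwarded to `submission.py`.

```shell
# GPU index exposed to PyTorch; defaults to 0 (the first GPU) when GPU is unset.
gpu="${GPU:-0}"
# Assumption: fall back to one of the released model names if modelname is unset.
modelname="${modelname:-rigidmask-sf}"

# run_coral [extra submission.py flags]: segment the coral sequence, then render the GIF.
run_coral() {
    CUDA_VISIBLE_DEVICES="$gpu" python submission.py --dataset seq-coral \
        --datapath input/imgs/coral/ --outdir "./weights/$modelname/" \
        --loadmodel "./weights/$modelname/weights.pth" --testres 1 "$@"
    python eval/generate_visual.py --datapath "weights/$modelname/seq-coral/" \
        --imgpath input/imgs/coral
}
```

Calling `run_coral --refine` turns on rigid motion refinement, and `GPU=1` set before sourcing the snippet restores the device index used in the original commands.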

Run and visualize two-view depth estimation on the KITTI video. An output-depth.gif will be saved to the current folder.

CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-kitti --datapath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1.2 --refine
python eval/generate_visual.py --datapath weights/$modelname/seq-kitti/ --imgpath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx
python eval/render_scene.py --inpath weights/rigidmask-sf/seq-kitti/pc0-0000001110.ply

Run and evaluate kitti-sceneflow (monocular setup, Tab. 1 and Tab. 2),

CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.2 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py   --path weights/$modelname/  --dataset 2015

Run and evaluate on Sintel,

CUDA_VISIBLE_DEVICES=1 python submission.py --dataset sintel_mrflow_val --datapath path-to-sintel-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.5 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset sintel
python eval/eval_sf.py   --path weights/$modelname/  --dataset sintel

Run and evaluate kitti-sceneflow (stereo setup, Tab. 6),

CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-images   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-train-hsm-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py    --path weights/$modelname/  --dataset 2015

To generate results for kitti-sceneflow benchmark (stereo setup, Tab. 3),

mkdir ./benchmark_output
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015test --datapath path-to-kitti-sceneflow-images  --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-test-ganet-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo

Training (todo)

Acknowledgement (incomplete)