This is the official PyTorch implementation of the CVPR 2022 paper “CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation”.
- 2022-03-07: We release the code and the pretrained weights.
- 2022-03-03: Our paper is accepted by CVPR 2022.
- 2021-11-20: Our paper is available at https://arxiv.org/abs/2111.10502.
- 2021-11-04: Our method ranked #1 on the KITTI Scene Flow leaderboard.
| Training schedule | Weights |
|---|---|
| FlyingThings3D -> Driving | `driving.pt` |
| FlyingThings3D -> Driving -> KITTI | `kitti.pt` |
Here, we provide precomputed results for the submission to the online benchmark of KITTI Scene Flow.
Create a PyTorch environment using conda:

```shell
conda create -n camliflow python=3.7
conda activate camliflow
conda install pytorch==1.10.2 torchvision==0.11.3 cudatoolkit=10.2 -c pytorch
```
Install other dependencies:

```shell
pip install opencv-python open3d tensorboard omegaconf
```
Compile CUDA extensions for faster training and evaluation (optional):

```shell
cd models/csrc
python setup.py build_ext --inplace
```
NG-RANSAC is also required if you want to evaluate on KITTI. Please follow https://github.com/vislearn/ngransac to install the library.
First, download and preprocess the dataset (see `preprocess_flyingthings3d_subset.py` for detailed instructions):

```shell
python preprocess_flyingthings3d_subset.py --input_dir /mnt/data/flyingthings3d_subset
```
Then, download the pretrained weights `things.pt` and save them to
To reproduce the results in Table 1 (see the main paper):

```shell
python eval_things.py --weights checkpoints/things.pt
```
First, download the following parts:
- Main data: `data_scene_flow.zip`
- Calibration files: `data_scene_flow_calib.zip`
- Disparity estimation (from GA-Net): `disp_ganet.zip`
- Semantic segmentation (from DDR-Net): `semantic_ddr.zip`
Unzip them and organize the directory as follows:

```
/mnt/data/kitti_scene_flow
├── testing
│   ├── calib_cam_to_cam
│   ├── calib_imu_to_velo
│   ├── calib_velo_to_cam
│   ├── disp_ganet
│   ├── flow_occ
│   ├── image_2
│   ├── image_3
│   └── semantic_ddr
└── training
    ├── calib_cam_to_cam
    ├── calib_imu_to_velo
    ├── calib_velo_to_cam
    ├── disp_ganet
    ├── disp_occ_0
    ├── disp_occ_1
    ├── flow_occ
    ├── image_2
    ├── image_3
    ├── obj_map
    └── semantic_ddr
```
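As a quick sanity check before running the submission script, a small Python sketch (not part of the release) could verify that every expected folder is present. The sub-directory names come directly from the tree above; `missing_dirs` is a hypothetical helper:

```python
import os

# Sub-directories expected in each split, taken from the layout above
# (the testing split has fewer ground-truth folders than training).
EXPECTED = {
    "testing": [
        "calib_cam_to_cam", "calib_imu_to_velo", "calib_velo_to_cam",
        "disp_ganet", "flow_occ", "image_2", "image_3", "semantic_ddr",
    ],
    "training": [
        "calib_cam_to_cam", "calib_imu_to_velo", "calib_velo_to_cam",
        "disp_ganet", "disp_occ_0", "disp_occ_1", "flow_occ",
        "image_2", "image_3", "obj_map", "semantic_ddr",
    ],
}

def missing_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    missing = []
    for split, subdirs in EXPECTED.items():
        for d in subdirs:
            if not os.path.isdir(os.path.join(root, split, d)):
                missing.append(os.path.join(split, d))
    return missing

# Example: missing_dirs("/mnt/data/kitti_scene_flow") should return []
# when the dataset is organized as shown above.
```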
Then, download the pretrained weights `kitti.pt` and save them to
To reproduce the results without rigid background refinement (SF-all: 5.62%):

```shell
python kitti_submission.py --weights checkpoints/kitti.pt
```
To reproduce the results with rigid background refinement (SF-all: 4.43%):

```shell
python kitti_submission.py --weights checkpoints/kitti.pt --refine
```
You should get the same results as the precomputed ones.
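One way to check this, assuming your generated submission and the precomputed results sit in two directories (the paths and the `diff_dirs` helper below are illustrative, not part of the release), is to compare file hashes:

```python
import hashlib
import os

def file_md5(path):
    """MD5 of a file, read in chunks to keep memory use low."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def diff_dirs(ours, precomputed):
    """Return relative paths whose contents differ or exist on only one side."""
    names = set()
    for root in (ours, precomputed):
        for dirpath, _, files in os.walk(root):
            for name in files:
                names.add(os.path.relpath(os.path.join(dirpath, name), root))
    diffs = []
    for rel in sorted(names):
        a, b = os.path.join(ours, rel), os.path.join(precomputed, rel)
        if not (os.path.isfile(a) and os.path.isfile(b)) or file_md5(a) != file_md5(b):
            diffs.append(rel)
    return diffs

# Example (hypothetical paths): diff_dirs("outputs/kitti_submission", "precomputed")
# returns [] when every file matches the precomputed results.
```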
You need to preprocess the FlyingThings3D dataset before training (see `preprocess_flyingthings3d_subset.py` for detailed instructions).
First, pretrain the model on FlyingThings3D with the L2-norm loss:

```shell
python train.py --config conf/train/pretrain.yaml
```
Then, finetune the model on FlyingThings3D with the robust loss:

```shell
python train.py --config conf/train/things.yaml --weights outputs/pretrain/ckpts/best.pt
```
The entire training process takes about 10 days on 4 Tesla V100-SXM2-32GB GPUs. When the training is finished, the best weights should be saved to
You need to preprocess the Driving dataset before training (see `preprocess_driving.py` for detailed instructions).
We adopt the training schedule FlyingThings3D -> Driving -> KITTI. Specifically, we first train our model on FlyingThings3D (see the section above for details), then finetune it on Driving and KITTI sequentially.
First, finetune the model on Driving using the weights trained on FlyingThings3D:

```shell
python train.py --config conf/train/driving.yaml --weights outputs/things/ckpts/best.pt
```
Then, finetune the model on KITTI using the weights trained on Driving:

```shell
python train.py --config conf/train/kitti.yaml --weights outputs/driving/ckpts/best.pt
```
The entire training process takes about 0.5 days on 2 Tesla V100-SXM2-32GB GPUs. When the training is finished, the best weights should be saved to