A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection
UFO is a simple, unified framework for co-object segmentation tasks: co-segmentation, co-saliency detection, and video salient object detection. Humans tend to discover objects by learning from a group of images or several frames of video, since we live in a dynamic world. In computer vision, much research focuses on co-segmentation (CoS), co-saliency detection (CoSD), and video salient object detection (VSOD) to discover co-occurring objects. However, previous approaches design a separate network for each of these tasks, which limits how easy deep learning frameworks are to use. In this paper, we introduce a unified framework to tackle these issues, termed UFO (Unified Framework for Co-Object Segmentation). All tasks share the same framework.
Task & Framework
- torch >= 1.7.0
- torchvision >= 0.7.0
- python3
Training on video (w/o flow). We load the weights pre-trained on the static image dataset and use DAVIS and FBMS to train our framework.
python finetune.py --model=models/image_best.pth --use_flow=False
Training on video (w/ flow). Same as above, except that we use DAVIS_flow and FBMS_flow to train our network.
python finetune.py --model=models/image_best.pth --use_flow=True
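A note on the `--use_flow=True` / `--use_flow=False` style used in the two commands above: in Python's `argparse`, declaring such a flag with `type=bool` would treat the string `"False"` as true, because any non-empty string is truthy. How `finetune.py` actually parses its flags is not shown here, so the following is only a minimal sketch of one common way to handle this spelling. The helper `str2bool` and the function `build_parser` are hypothetical names introduced for illustration, and the default values are assumptions.

```python
import argparse


def str2bool(value):
    """Convert common textual spellings of true/false to a bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser that accepts the two flags used in the commands above."""
    parser = argparse.ArgumentParser(description="flag-parsing sketch, not the project's parser")
    # Default values here are assumptions made for this sketch.
    parser.add_argument("--model", type=str, default="models/image_best.pth")
    parser.add_argument("--use_flow", type=str2bool, default=False)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.model, args.use_flow)
```

With a converter like this, `--use_flow=False` yields the boolean `False` rather than a truthy string.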
Generate the image results [checkpoint]
python test.py --model=models/image_best.pth --data_path=CoSdatasets/MSRC7/ --output_dir=CoS_results/MSRC7 --task=CoS_CoSD
Generate the video results [checkpoint]
python test.py --model=models/video_best.pth --data_path=VSODdatasets/DAVIS/ --output_dir=VSOD_results/wo_optical_flow/DAVIS --task=VSOD
Generate the video results with optical flow [checkpoint]
python test.py --model=models/video_flow_best.pth --data_path=VSODdatasets/DAVIS_flow/ --output_dir=VSOD_results/w_optical_flow/DAVIS --use_flow=True --task=VSOD
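When several datasets need to be processed, the test commands above can be assembled programmatically. The following is a minimal sketch that only reuses the flags shown in those commands (`--model`, `--data_path`, `--output_dir`, `--use_flow`, `--task`). The helper names `build_test_command` and `run_all` and the `RUNS` list are hypothetical and not part of the repository, and the sketch assumes it is launched from the repository root where `test.py` lives.

```python
import subprocess
import sys


def build_test_command(model, data_path, output_dir, task, use_flow=False):
    """Assemble the argument list for one test.py invocation (hypothetical helper)."""
    cmd = [
        sys.executable,
        "test.py",
        f"--model={model}",
        f"--data_path={data_path}",
        f"--output_dir={output_dir}",
    ]
    if use_flow:
        # Only pass the flag when optical flow is requested, as in the commands above.
        cmd.append("--use_flow=True")
    cmd.append(f"--task={task}")
    return cmd


# Paths copied from the image and video (w/o flow) commands above.
RUNS = [
    dict(model="models/image_best.pth", data_path="CoSdatasets/MSRC7/",
         output_dir="CoS_results/MSRC7", task="CoS_CoSD"),
    dict(model="models/video_best.pth", data_path="VSODdatasets/DAVIS/",
         output_dir="VSOD_results/wo_optical_flow/DAVIS", task="VSOD"),
]


def run_all(runs, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for run in runs:
        cmd = build_test_command(**run)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(RUNS, dry_run=True)
```

Keeping `dry_run=True` prints the commands without launching them, which makes it easy to check paths before starting a long inference run.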
- Pre-Computed Results: Please download the prediction results of our framework from the Results section.
- Evaluation Toolbox: We use the standard evaluation toolbox from the COCA benchmark.
- [Optional] Single Object Tracking (SOT) on GOT-10k val set
python demo.py --data_path=./demo_mp4/video/kobe.mp4 --output_dir=./demo_mp4/result
Our project references the code in the following repos.