Does Decentralized Learning with non-IID Unlabeled Data Benefit from Self Supervision?

This is the PyTorch implementation of the paper “Does Decentralized Learning with non-IID Unlabeled Data Benefit from Self Supervision?”.


Installation

pip install -r requirements.txt

Main Training Command

  1. Centralized SSL experiment:
     python src/ --dataset=cifarssl --gpu=0 --iid=0 --dirichlet --dir_beta 0.02
  2. Decentralized SSL experiment:
     python src/ --dataset=cifarssl --gpu=0 --iid=0 --dirichlet --dir_beta 0.02
  3. Decentralized SL experiment:
     python src/ --dataset=cifarssl --gpu=0 --iid=0 --dirichlet --dir_beta 0.02
  4. Decentralized SL Representation experiment:
     python src/ --dataset=cifarssl --gpu=0 --iid=0 --dirichlet --dir_beta 0.02
  5. Decentralized Feature Alignment SSL experiment:
     python src/ --dataset=cifarssl --gpu=0 --iid=0 --dirichlet --dir_beta 0.02
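
The `--dirichlet` and `--dir_beta` flags above select a Dirichlet-based non-IID split of the data across clients. The sketch below illustrates the general technique only and is not the code under `src/`: it assumes `dir_beta` acts as the Dirichlet concentration parameter (smaller values give more label skew), and the function name `dirichlet_partition` and its arguments are hypothetical.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, beta, seed=0):
    """Split sample indices across clients with per-class Dirichlet proportions.

    Illustrative sketch only; `beta` is assumed to play the role of --dir_beta.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)

        # A Dirichlet(beta, ..., beta) draw is a vector of normalized Gamma(beta, 1) draws.
        gammas = [rng.gammavariate(beta, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        if total == 0.0:
            # Very small beta can underflow to all zeros; give the class to one client.
            gammas = [0.0] * num_clients
            gammas[rng.randrange(num_clients)] = 1.0
            total = 1.0
        proportions = [g / total for g in gammas]

        # Turn proportions into contiguous slices of this class's indices.
        start, cumulative = 0, 0.0
        for c, p in enumerate(proportions):
            cumulative += p
            if c == num_clients - 1:
                end = len(idxs)  # last client takes whatever remains
            else:
                end = min(len(idxs), max(start, int(round(cumulative * len(idxs)))))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    # Toy labels: 1000 samples over 10 balanced classes.
    toy_labels = [i % 10 for i in range(1000)]
    parts = dirichlet_partition(toy_labels, num_clients=5, beta=0.02)
    for c, part in enumerate(parts):
        classes = sorted({toy_labels[i] for i in part})
        print(f"client {c}: {len(part)} samples, classes {classes}")
```

With a small `beta` such as 0.02, each class tends to land almost entirely on one client; a larger `beta` moves the split closer to IID.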

Example Scripts on CIFAR

  1. Run the Dirichlet non-i.i.d. SSL experiment with SimCLR on CIFAR-10:
     bash scripts/noniid_script/
  2. Run the ablation study with SimSiam:
     bash scripts/noniid_script/
  3. Run distributed training:
     bash scripts/noniid_script/
  4. Run training with Ray:
     bash scripts/noniid_script/

ImageNet Experiment

  1. Generate the ImageNet-100 dataset for smaller-scale experiments:
     python misc/ [PATH_TO_EXISTING_IMAGENET] [PATH_TO_CREATE_SUBSET]

  2. Launch as a batch job with two V100 GPUs on a cluster:
     bash scripts/imagenet_script/

  3. Train on ImageNet-100:
     bash scripts/imagenet_script/

  4. Train on full ImageNet:
     bash scripts/imagenet_script/

Transfer Learning: Object Detection / Segmentation

  1. Install Detectron2 and set up the data folders following Detectron2’s dataset instructions.

  2. Convert pre-trained models to Detectron2 models:
     python misc/ model.pth det_model.pkl

  3. Go to Detectron2’s folder and run:
     python tools/ --config-file /path/to/config/config.yaml MODEL.WEIGHTS /path/to/model/det_model.pkl

where config.yaml is the config file listed under the configs folder.

File Structure

├── ...
├── Dec-SSL
│   ├── data            # training data
│   ├── src             # source code
│   │   ├── options     # parameters and config
│   │   ├── sampling    # different sampling regimes for non-IIDness
│   │   ├── update      # pipeline for each local client
│   │   ├── models      # network architecture
│   │   ├── *_main      # main training and testing scripts
│   │   └── ...
│   ├── save            # logged results
│   ├── scripts         # experiment scripts
│   └── misc            # related scripts for fine-tuning
└── ...


If you find Dec-SSL useful in your research, please consider citing:

	author    = {Lirui Wang and Kaiqing Zhang and Yunzhu Li and Yonglong Tian and Russ Tedrake},
	title     = {Does Self-Supervised Learning Excel at Handling Decentralized and Non-IID Unlabeled Data?},
	booktitle = {arXiv:2210.10947},
	year      = {2022}





