Confusion Training

Official implementation of "Fight Poison with Poison: Detecting Backdoor Poison Samples via Decoupling Benign Correlations".

Attacks

See poison_tool_box/.

Adaptive

  • adaptive_blend: adaptive attack with a single blending trigger
  • adaptive_k: adaptive attack with k=4 different triggers

Others

  • badnet: basic attack with the BadNets patch trigger
  • blend: basic attack with a single blending trigger
  • dynamic: input-aware dynamic backdoor attack
  • clean_label: clean-label attack
  • SIG: attack with a sinusoidal signal trigger
  • TaCT: source-specific attack
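
For intuition, the blending-style triggers used by blend and adaptive_blend are typically applied by alpha-compositing a trigger image over the input. The snippet below is a generic sketch of that operation, not the code in poison_tool_box/.

# Generic sketch of a blending trigger (alpha compositing); the actual
# poison generators live in poison_tool_box/.
import numpy as np

def apply_blend_trigger(image: np.ndarray, trigger: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    """Blend a full-image trigger onto a clean image with opacity alpha.
    Both arrays are HxWxC with values in [0, 1]."""
    assert image.shape == trigger.shape
    return np.clip((1.0 - alpha) * image + alpha * trigger, 0.0, 1.0)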

Cleansers

Ours

  • confusion_training.py
  • run poison_cleanser_iter.py to launch
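
Roughly speaking (this is a loose paraphrase of the paper's high-level idea, not the interface of confusion_training.py), confusion training fine-tunes a model on the suspicious training set mixed with a small clean set whose labels are randomized; this decouples benign feature-label correlations, so the samples that the "confused" model still fits to their given labels are flagged as likely poisons. A conceptual sketch with hypothetical helper names:

# Conceptual sketch of the confusion-training idea (hypothetical helpers,
# not the actual API of confusion_training.py).
import torch
import torch.nn.functional as F

def confusion_training_sketch(model, poisoned_loader, clean_loader, optimizer, num_classes, rounds=5):
    """Mix a small, randomly relabeled clean batch into every poisoned-set batch
    so benign feature-label correlations are destroyed while the (stronger)
    trigger-label correlation survives."""
    model.train()
    for _ in range(rounds):
        for (x_p, y_p), (x_c, _) in zip(poisoned_loader, clean_loader):
            # Randomly relabel the clean "confusion" batch.
            y_rand = torch.randint(0, num_classes, (x_c.size(0),), device=x_c.device)
            x = torch.cat([x_p, x_c])
            y = torch.cat([y_p, y_rand])
            loss = F.cross_entropy(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    # Poisoned-set samples that the confused model still predicts as their
    # given labels are flagged as suspicious.
    return model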

Others

See other_cleanses/.

  • SCAn: statistical contamination analyzer
  • AC: activation clustering
  • SS: spectral signature
  • SPECTRE: robust-statistics-based spectral defense
  • Strip: STRIP, perturbation-based detection
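
For reference, the spectral signature (SS) baseline scores the samples of each class by their alignment with the top singular vector of the centered feature matrix. The sketch below only illustrates that score; the implementations actually used live in other_cleanses/.

# Minimal sketch of the spectral-signature outlier score (Tran et al., 2018).
# Illustrative only; see other_cleanses/ for the code used in this repo.
import numpy as np

def spectral_signature_scores(features: np.ndarray) -> np.ndarray:
    """features: N x D latent representations of samples from one class.
    Returns an outlier score per sample; the highest-scoring samples are
    removed as suspected poisons."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Top right-singular vector of the centered feature matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_direction = vt[0]
    return (centered @ top_direction) ** 2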

Visualization

See visualize.py.

  • tsne
  • pca
  • oracle
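
A t-SNE or PCA view of the latent space can be produced with scikit-learn along the following lines; this generic snippet is only illustrative of the tsne/pca options and is not the plotting code in visualize.py.

# Generic t-SNE visualization of latent features, colored by a poison mask.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(features: np.ndarray, is_poison: np.ndarray, out_path: str = "tsne.png"):
    """features: N x D array; is_poison: boolean array of length N."""
    emb = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(features)
    plt.scatter(emb[~is_poison, 0], emb[~is_poison, 1], s=2, label="clean")
    plt.scatter(emb[is_poison, 0], emb[is_poison, 1], s=2, label="poison")
    plt.legend()
    plt.savefig(out_path)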

Quick Start

To launch and defend against an Adaptive-Blend attack:

# Create a clean set
python create_clean_set.py -dataset=cifar10

# Create a poisoned training set
python create_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005

# Train on the poisoned training set
python train_on_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005
python train_on_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005 -no_aug

# Visualize
## $METHOD = ['pca', 'tsne', 'oracle']
python visualize.py -method=$METHOD -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005

# Cleanse with Confusion Training
python poison_cleanser_iter.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005

# Cleanse with other cleansers
## $CLEANSER = ['SCAn', 'AC', 'SS', 'Strip', 'SPECTRE']
python other_cleanser.py -cleanser=$CLEANSER -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005

# Retrain on cleansed set
## $CLEANSER = ['CT', 'SCAn', 'AC', 'SS', 'Strip', 'SPECTRE']
python train_on_cleansed_set.py -cleanser=$CLEANSER -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005
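
If you prefer to drive the pipeline from Python instead of the shell, the same scripts can be chained with subprocess. The wrapper below simply replays the main quick-start commands above (it assumes each script accepts the arguments exactly as listed in this section).

# Convenience wrapper that replays the quick-start commands above.
import subprocess

POISON_ARGS = ["-poison_type=adaptive_blend", "-poison_rate=0.005", "-cover_rate=0.005"]

def run(script, *extra):
    subprocess.run(["python", script, "-dataset=cifar10", *extra], check=True)

run("create_clean_set.py")
run("create_poisoned_set.py", *POISON_ARGS)
run("train_on_poisoned_set.py", *POISON_ARGS)
run("poison_cleanser_iter.py", *POISON_ARGS)                    # confusion training
run("train_on_cleansed_set.py", "-cleanser=CT", *POISON_ARGS)   # retrain on cleansed set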

Note:

The poisoning attacks evaluated in our papers:

# No Poison
python create_poisoned_set.py -dataset=cifar10 -poison_type=none -poison_rate=0
# BadNet
python create_poisoned_set.py -dataset=cifar10 -poison_type=badnet -poison_rate=0.01
# Blend
python create_poisoned_set.py -dataset=cifar10 -poison_type=blend -poison_rate=0.01
# Dynamic
python create_poisoned_set.py -dataset=cifar10 -poison_type=dynamic -poison_rate=0.01
# Clean Label
python create_poisoned_set.py -dataset=cifar10 -poison_type=clean_label -poison_rate=0.005
# SIG
python create_poisoned_set.py -dataset=cifar10 -poison_type=SIG -poison_rate=0.02
# TaCT
python create_poisoned_set.py -dataset=cifar10 -poison_type=TaCT -poison_rate=0.02 -cover_rate=0.01
# Adaptive Blend
python create_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005
# Adaptive K
python create_poisoned_set.py -dataset=cifar10 -poison_type=adaptive_k -poison_rate=0.005 -cover_rate=0.01
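
To generate all of these poisoned sets in one pass, the commands above can be replayed from Python, e.g.:

# Replay the poisoned-set creation commands above for every attack setting.
import subprocess

ATTACKS = [
    ("none",           "0",     None),
    ("badnet",         "0.01",  None),
    ("blend",          "0.01",  None),
    ("dynamic",        "0.01",  None),
    ("clean_label",    "0.005", None),
    ("SIG",            "0.02",  None),
    ("TaCT",           "0.02",  "0.01"),
    ("adaptive_blend", "0.005", "0.005"),
    ("adaptive_k",     "0.005", "0.01"),
]

for poison_type, rate, cover in ATTACKS:
    cmd = ["python", "create_poisoned_set.py", "-dataset=cifar10",
           f"-poison_type={poison_type}", f"-poison_rate={rate}"]
    if cover is not None:
        cmd.append(f"-cover_rate={cover}")
    subprocess.run(cmd, check=True)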

You can also:

  • specify details of the trigger (for the blend, clean_label, adaptive_blend, and TaCT attacks) via
    • -alpha=$ALPHA, the opacity of the trigger.
    • -trigger=$TRIGGER_NAME, where $TRIGGER_NAME is the name of a 32×32 trigger mark image in triggers/. If another image named mask_$TRIGGER_NAME also exists in triggers/, it will be used as the trigger mask; otherwise, by default, all-black pixels of the trigger mark are treated as transparent and not applied (see the sketch after this list).
  • train a vanilla model via
    python train_vanilla.py
  • test a trained model via

    python test_model.py -dataset=cifar10 -poison_type=adaptive_blend -poison_rate=0.005 -cover_rate=0.005
    # other options include: -no_aug, -cleanser=$CLEANSER, -model_path=$MODEL_PATH, see our code for details
  • enforce a fixed running seed via -seed=$SEED option
  • change dataset to GTSRB via -dataset=gtsrb option
  • see more configurations in config.py
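
As referenced in the trigger bullet above, a trigger mark with an optional mask is composited onto an image roughly as follows. This sketch mirrors the convention described there (black pixels of the mark are transparent unless a mask image is provided) and is not the repository's exact code.

# Sketch of trigger-mark application with an optional mask (illustrative only).
import numpy as np

def apply_trigger(image: np.ndarray, mark: np.ndarray, alpha: float, mask: np.ndarray = None) -> np.ndarray:
    """image, mark: HxWxC arrays in [0, 1]; mask: HxW array in [0, 1] or None.
    Without an explicit mask, all-black pixels of the mark are left untouched."""
    if mask is None:
        mask = (mark.sum(axis=-1) > 0).astype(image.dtype)  # black pixels -> transparent
    m = mask[..., None] * alpha
    return (1.0 - m) * image + m * mark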
