The Rich Get Richer: Disparate Impact of Semi-Supervised Learning
Preprocessing the dataset used for implicit sub-populations:
(Demographic groups: race and gender)
The following code preprocesses the Jigsaw dataset and returns train/test dataset files that include demographic group information.
Download the Jigsaw dataset file identity_individual_annotations.csv from
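As a rough illustration of this kind of preprocessing, the sketch below splits annotated rows into train and test sets while keeping each demographic group represented in both. It is not the repository's script: the function names `split_by_group` and `write_csv`, the column names `race` and `gender`, and the 80/20 split are all assumptions chosen for the example, and the real Jigsaw file may use different column names.

```python
import csv
import random
from collections import defaultdict


def split_by_group(rows, group_cols=("race", "gender"), test_frac=0.2, seed=0):
    """Split rows into train/test per demographic group (hypothetical helper).

    `rows` is a list of dicts, e.g. as produced by csv.DictReader.
    `group_cols` are assumed column names, not confirmed Jigsaw columns.
    Every group with at least two rows ends up in both splits.
    """
    # Bucket rows by their demographic group key, e.g. ("black", "female").
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(row[c] for c in group_cols)].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for key in sorted(buckets):
        members = buckets[key]
        rng.shuffle(members)
        if len(members) > 1:
            # At least one test row, but always leave one row for training.
            n_test = min(len(members) - 1, max(1, round(len(members) * test_frac)))
        else:
            n_test = 0  # a single-row group goes to train only
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def write_csv(path, rows, fieldnames):
    """Write a list of dict rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

To use it, read the downloaded file with `csv.DictReader`, pass the resulting list of rows to `split_by_group`, and write each split with `write_csv`. Splitting per group rather than over the whole file keeps small sub-populations from disappearing from the test set, which matters when per-group accuracy is the quantity being compared.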
Implementation of SSL methods
Please follow the official implementations of MixMatch, MixText, and UDA.