The accompanying codes reproduce all figures and statistics presented in “Controlling for multiple covariates” by Mark Tygert. This repository also provides the LaTeX and BibTeX sources required for replicating the paper.
Be sure to run pip install hilbertcurve prior to running any of this software (the codes depend on HilbertCurve). Also be sure to run gunzip codes/cup98lrn.txt.gz prior to running the script that processes the KDD Cup 1998 data (codes/kddcup98.py).
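On systems without the gunzip command, the decompression step can be approximated with the Python standard library. The following is a minimal sketch, not part of the repository; the helper name gunzip_keep is hypothetical, and unlike gunzip it leaves the compressed file in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_keep(src, dst=None):
    """Decompress a .gz file, keeping the original compressed file.

    Hypothetical helper (not part of the repository). If dst is omitted,
    the output path is src with its final suffix (".gz") removed.
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    # Stream the data in chunks so that large files need not fit in memory.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Assuming the working directory is the root of the repository, calling gunzip_keep("codes/cup98lrn.txt.gz") would write codes/cup98lrn.txt next to the compressed file.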
The main files in the repository are the following:
tex/multidim.pdf PDF version of the paper
tex/multidim.tex LaTeX source for the paper
tex/multidim.bib BibTeX source for the paper
tex/partition.pdf Graphics for Subsection 2.3 of the paper
codes/acs.py Python script for processing the American Community Survey
codes/psam_h06.csv Microdata from the 2019 American Community Survey of the U.S. Census Bureau
codes/kddcup98.py Python script for processing the KDD Cup 1998 data
codes/cup98lrn.txt.gz Data from the 1998 KDD Cup
codes/synthetic.py Python script for generating and processing synthetic examples
codes/hilbert.pdf Plot of an approximation with 255 line segments to the Hilbert curve in 2D
codes/disjoint.py Functions for plotting differences between two subpops. with disjoint scores (redistributed from the GitHub repo fbcddisgraph)
codes/subpop.py Functions for plotting differences of a subpop. from the full population (redistributed from the GitHub repo )
codes/subpop_weighted.py Functions for plotting differences of a subpop. from the full pop. with weights (redistributed from the GitHub repo fbcdgraph)
Regenerating all the figures requires running, in the directory codes, the Python scripts listed above, including synthetic.py; issue the commands
pip install hilbertcurve
python acs.py --var 'MV'
python acs.py --var 'NOC'
python acs.py --var 'MV+NOC'
python acs.py --var 'NOC+MV'
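The four acs.py invocations above can also be driven from a single Python script. This is only a sketch under stated assumptions: the names ACS_VARS, acs_commands, and run_all are hypothetical, and the default working directory codes assumes the script is launched from the root of the repository. Passing each command as a list of arguments avoids any shell quoting of values such as MV+NOC.

```python
import subprocess
import sys

# The four settings of --var used in the commands listed in this README.
ACS_VARS = ["MV", "NOC", "MV+NOC", "NOC+MV"]


def acs_commands(python=sys.executable):
    """Return one argument list per acs.py invocation (hypothetical helper)."""
    return [[python, "acs.py", "--var", var] for var in ACS_VARS]


def run_all(cwd="codes", dry_run=False):
    """Run every acs.py command in turn, stopping at the first failure.

    cwd="codes" is an assumption about where acs.py lives relative to the
    current directory. With dry_run=True the commands are only printed.
    """
    for cmd in acs_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a nonzero exit status.
            subprocess.run(cmd, cwd=cwd, check=True)
```

Calling run_all(dry_run=True) first is a cheap way to confirm the commands before starting the actual runs with run_all().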
This metamulti software is licensed under the (MIT-type) copyright LICENSE file in the root directory of this source tree.