Code repository for a Master's thesis on backdoor attacks on transformer-based DNNs for tabular data.
tabular-backdoors # Project directory
├── data # Contains datasets and preprocessing notebooks
├── ExpCleanLabel # Experiment code for Clean Label Attack
├── ExpInBounds # Experiment code for In Bounds Trigger
├── ExpTriggerPosition # Experiment code for Trigger Position based on feature importance
├── ExpTriggerSize # Experiment code for Trigger Size
├── SAINT # SAINT model code
├── FTtransformer # FT-Transformer model code
└── Notebooks # Other (smaller or parts of) experiments in the form of notebooks
    ├── FeatureImportances # Notebooks to calculate feature importance scores and rankings
    └── Defences # Notebooks on defences against our attacks
virtualenv tabularbackdoor
source tabularbackdoor/bin/activate
pip install -r requirements.txt
# To run the notebooks you also need:
pip install notebook
Download accepted_2007_to_2018Q4.csv from https://www.kaggle.com/datasets/wordsforthewise/lending-club and place it in data/LOAN/
Download LCDataDictionary.xlsx from https://www.kaggle.com/datasets/adarshsng/lending-club-loan-data-csv?select=LCDataDictionary.xlsx and place it in data/LOAN/
Download HIGGS.csv.gz from https://archive.ics.uci.edu/ml/datasets/HIGGS and extract HIGGS.csv to data/HIGGS
Run the notebooks in data/preprocess to generate the .pkl files containing the datasets for the experiments.

Run the shell script in any of the Exp* folders from the project root, with the Python filename (without extension) as argument. Output will be logged to the output folder.
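Before launching an experiment, it may be worth confirming that the raw dataset files named in the download steps are in place. The following is a minimal sketch, not part of the repository: the helper name missing_raw_files is hypothetical, and it only checks the three raw files mentioned above, not the generated .pkl files (whose names are not listed here).

```python
from pathlib import Path

# Raw files named in the download instructions, relative to the project root.
EXPECTED_RAW_FILES = [
    "data/LOAN/accepted_2007_to_2018Q4.csv",
    "data/LOAN/LCDataDictionary.xlsx",
    "data/HIGGS/HIGGS.csv",
]


def missing_raw_files(project_root="."):
    """Return the expected raw files that do not exist under project_root.

    Hypothetical helper for illustration; the repository itself does not
    ship this check.
    """
    root = Path(project_root)
    return [rel for rel in EXPECTED_RAW_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing raw dataset files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All raw dataset files are in place.")
```

Run it from the project root so that the relative paths resolve correctly.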
To choose which GPU is used, change the cuda:x string (located near the top) in each .py file.

Example:
bash ExpTriggerSize/run_experiment.sh TabNet_CovType_1F_OOB
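To queue several experiments, the same invocation can be driven from Python. This is a sketch under assumptions: build_command and run_experiments are hypothetical helpers, and whether run_experiment.sh blocks until the experiment finishes or returns immediately is not stated here, so the sequential behaviour is an assumption. By default it only returns the commands without running them.

```python
import subprocess


def build_command(exp_folder, script_name):
    """Build the invocation: bash <Exp folder>/run_experiment.sh <script name>."""
    return ["bash", f"{exp_folder}/run_experiment.sh", script_name]


def run_experiments(exp_folder, script_names, dry_run=True):
    """Build (and optionally run) one command per experiment script.

    Must be called from the project root. With dry_run=True nothing is
    executed; the commands are only returned for inspection.
    """
    commands = [build_command(exp_folder, name) for name in script_names]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the command for the example experiment above.
    for cmd in run_experiments("ExpTriggerSize", ["TabNet_CovType_1F_OOB"]):
        print(" ".join(cmd))
```

Pass dry_run=False to actually start the experiments.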
To view the log of a running experiment live, use tail -f with the logfile as argument in a new terminal:
tail -f output/triggersize/TabNet_CovType_1F_OOB.log
Output logs are found in the output/ folder. All logs end with a section EASY COPY PASTE RESULTS: where you can copy the resulting lists containing the ASR and BA for each run.
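Instead of copying by hand, the lists could also be pulled out of a log programmatically. The sketch below assumes, without confirmation from the logs themselves, that the results are printed as flat Python list literals (for example [0.1, 0.2]) somewhere after the EASY COPY PASTE RESULTS: marker. The function name extract_result_lists is hypothetical, and which list is the ASR and which the BA depends on the order in the actual log.

```python
import ast
import re

MARKER = "EASY COPY PASTE RESULTS:"


def extract_result_lists(log_text):
    """Return every flat Python list literal found after the last marker.

    Assumption: results appear as list literals such as [0.1, 0.2].
    Bracketed text that is not a valid literal (e.g. [INFO]) is skipped.
    """
    _, found, tail = log_text.rpartition(MARKER)
    if not found:
        return []  # marker absent: the run may not have finished yet
    lists = []
    for match in re.finditer(r"\[[^\[\]]*\]", tail):
        try:
            lists.append(ast.literal_eval(match.group(0)))
        except (ValueError, SyntaxError):
            continue
    return lists
```

For example, extract_result_lists(open("output/triggersize/TabNet_CovType_1F_OOB.log").read()) would return the lists from the log shown earlier, provided the assumed format holds.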
See the Notebooks/ folder for other (smaller or parts of) experiments in the form of notebooks. To run the defences, you must first run the appropriate CreateModel notebook to create a backdoored model and dataset, which can then be analyzed with the other notebooks. For the Fine-Pruning defence, there is a dedicated subfolder in the Notebooks/Defences folder with notebooks to train, prune and finetune FTT.