Malware-Related Sentence Classification

This repo contains the code for the ICTAI 2021 paper “Enrichment of Features for Malware-Related Sentence Classification using External Knowledge”.

Installation

Install from source. A Python virtual environment or a Conda environment is recommended.

git clone https://github.com/chaumng/malware_related_sentence_classification.git
cd malware_related_sentence_classification
pip install -r requirements.txt

This repo is tested on Python 3.7.

Classification and Evaluation

Preprocess data

python preprocess_data.py

Parameter search: classify and evaluate

The GAT weak labels are already provided in a file in this repo.
To run the parameter search, use the following command. By default, it performs the second grid search. Set the param_grid_setting argument to “first_grid_search” to perform the first grid search, or to “best_setting” to run only the best setting.

python svm_param_search.py --param_grid_setting second_grid_search
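
For readers curious how an argument such as param_grid_setting can select between grids, here is a minimal sketch using only the Python standard library. It is not the code in svm_param_search.py: the grid contents, the PARAM_GRIDS name, and the helper functions are hypothetical and exist only for illustration. Only the argument name, its three values, and its default come from the description above.

```python
import argparse
from itertools import product

# Hypothetical grids for illustration only; the values actually searched
# by svm_param_search.py are defined in the repository and may differ.
PARAM_GRIDS = {
    "first_grid_search": {"C": [0.01, 1, 100], "kernel": ["linear", "rbf"]},
    "second_grid_search": {"C": [0.5, 1, 2], "kernel": ["linear"]},
    "best_setting": {"C": [1], "kernel": ["linear"]},
}


def parse_args(argv=None):
    """Parse the command line; the default mirrors the README's description."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--param_grid_setting",
        choices=sorted(PARAM_GRIDS),
        default="second_grid_search",
    )
    return parser.parse_args(argv)


def expand_grid(grid):
    """Return every combination of the grid's values as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]
```

With these assumed grids, expand_grid(PARAM_GRIDS[parse_args(argv).param_grid_setting]) yields the list of settings to train and evaluate one by one. A "best_setting" entry is simply a grid with a single value per parameter, so it expands to exactly one run.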

Citation

If you find this paper or this code useful, please cite this paper:

@inproceedings{chaunguyen_et_al_2021,
  title={Enrichment of Features for Malware-Related Sentence Classification using External Knowledge},
  author={Nguyen, Chau and Tran, Vu and Nguyen, Le Minh},
  booktitle={Proceedings of the 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI)},
  year={2021},
  organization={IEEE},
}
