ByteCog is a Python script that helps security researchers and others classify malicious software by comparing one or more unknown files against known samples. The script can be extended to use a machine learning model for classification if you want to do so. ByteCog combines several methods to analyze and classify the samples it is given. It uses Shannon entropy to give researchers a visual view of a sample, which helps when analyzing the code and looking for possibly readable code or text. It uses the Hausdorff distance to calculate a 'raw similarity' value from the difference between the entropy graphs of the two samples. Finally, because the Hausdorff-based value is in most cases very high when two samples have mostly the same entropy, ByteCog uses the Jaro-Winkler distance to adjust it into a 'true similarity' value.
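To make the entropy and 'raw similarity' steps concrete, here is a minimal sketch in pure Python. It is not ByteCog's actual implementation: the function names, the 256-byte chunk size, and the choice to treat each entropy graph as a set of (chunk index, entropy) points are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def chunked_entropy(data: bytes, chunk_size: int = 256) -> list:
    """Entropy of each consecutive chunk (chunk_size is an assumed default)."""
    return [
        shannon_entropy(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def directed_hausdorff(a, b) -> float:
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(
        min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in b)
        for p in a
    )


def hausdorff_distance(curve_a, curve_b) -> float:
    """Symmetric Hausdorff distance between two non-empty entropy curves.

    Each curve is treated as a set of (chunk_index, entropy) points,
    which is an assumption of this sketch.
    """
    a = list(enumerate(curve_a))
    b = list(enumerate(curve_b))
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

A distance of 0 means the two entropy curves coincide. How ByteCog turns the distance into its 'raw similarity' value is not described here, so that mapping is left out of the sketch.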
- Python 3.5 or later, which you can download from the official Python website.
Clone this repository to your local machine by following the instructions for cloning a repository.
Then install the dependencies listed in the requirements file by running the following line in your console window:
pip install -r requirements.txt
    ======================================================
    |  Version: 0.4                                      |
    |  Author: IlluminatiFish                            |
    ======================================================
    usage: bytecog.py [-h] -k KNOWN -u UNKNOWN -i IDENTIFIER -v VISUAL

    Determine whether an unknown provided sample is similar to a known sample

    optional arguments:
      -h, --help            show this help message and exit
      -k KNOWN, --known KNOWN
                            The file path to the known sample
      -u UNKNOWN, --unknown UNKNOWN
                            The file path to the unknown sample
      -i IDENTIFIER, --identifier IDENTIFIER
                            The antivirus identifier of the known file
      -v VISUAL, --visual VISUAL
                            If you want to show a visual representation of the
                            file entropy
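The usage text above maps naturally onto Python's `argparse` module. The sketch below shows one way such a parser could be declared. It is inferred only from the help output, not taken from ByteCog's source, and `build_parser` is a hypothetical name. All four options are marked required because the usage line shows them without brackets.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options shown in the usage text."""
    parser = argparse.ArgumentParser(
        prog="bytecog.py",
        description="Determine whether an unknown provided sample "
                    "is similar to a known sample",
    )
    parser.add_argument("-k", "--known", required=True,
                        help="The file path to the known sample")
    parser.add_argument("-u", "--unknown", required=True,
                        help="The file path to the unknown sample")
    parser.add_argument("-i", "--identifier", required=True,
                        help="The antivirus identifier of the known file")
    # The accepted values for --visual are not documented, so this sketch
    # keeps the raw string instead of guessing a type.
    parser.add_argument("-v", "--visual", required=True,
                        help="If you want to show a visual representation "
                             "of the file entropy")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.known, args.unknown, args.identifier, args.visual)
```

An invocation would then look like `python bytecog.py -k known.bin -u unknown.bin -i Trojan.Generic -v True`, where the file names, the identifier, and the value passed to `-v` are placeholders.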
Features & Use Cases
- Calculates sample similarity
- Generates chunked entropy graph
- Can help distinguish malicious from benign software samples
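For readers unfamiliar with the Jaro-Winkler measure behind the similarity feature, the following is a minimal pure-Python sketch of the standard algorithm on strings. The function names are hypothetical, and how ByteCog encodes its samples or entropy data before applying the measure is not covered here, so treat this as an illustration of the metric only.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0.0, 1.0]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only match if they are within this many positions.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str,
                            prefix_scale: float = 0.1,
                            max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix chars."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1, s2):
        if c1 != c2 or prefix == max_prefix:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)
```

The Jaro-Winkler distance is simply `1 - jaro_winkler_similarity(s1, s2)`. The prefix scale of 0.1 and the four-character prefix cap are the conventional defaults for this measure.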
Chunked Entropy Graph
Output of ByteCog
ByteCog Log File