Authored by: Tim Blazytko
Adapted by: mcdulltii
Automatically detect obfuscated code and other state machines
Scripts to automatically detect obfuscated code and state machines in binaries.
Implementation is based on IDA 7.4+ (Python3). Check out the following blog posts for more information on the Binary Ninja implementation:
- Automated Detection of Control-flow Flattening
- Automated Detection of Obfuscated Code
- Referenced Repository
Because a dominator tree is built recursively for every function found in the binary, the analysis is expensive to run, even though threading has been implemented. The following thresholds limit the work on large binaries and large functions:
    MAX_FUNCTIONS = 50
    MAX_NODES = 50

    # --- snipped ---
    if sum([1 for _ in idautils.Functions()]) > MAX_FUNCTIONS:
        detect.partial_heur()
    else:
        detect.all_heur()

    # --- snipped ---
    if sum([1 for _ in FlowChart(get_func(ea))]) > MAX_NODES:
        pass
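To make the dominator-tree cost mentioned above more concrete, here is a minimal sketch of a dominator-based flattening score on a plain dictionary CFG. It assumes the heuristic scores a function by the share of its blocks dominated by a block that is also the target of a back edge, which is how the referenced blog posts describe it. The names `dominators` and `flattening_score` are hypothetical, and this is not the plugin's code, which works on IDA flow charts.

```python
def dominators(cfg, entry):
    """Compute dominator sets with the classic iterative data-flow algorithm.

    cfg maps every block to the list of its successors; every block,
    including blocks without successors, must appear as a key.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def flattening_score(cfg, entry):
    """Return the largest fraction of blocks dominated by a back-edge target."""
    dom = dominators(cfg, entry)
    best = 0.0
    for head in cfg:
        dominated = {n for n in cfg if head in dom[n]}
        # A back edge exists if a block dominated by `head` jumps to `head`.
        if any(head in cfg[n] for n in dominated):
            best = max(best, len(dominated) / len(cfg))
    return best


if __name__ == "__main__":
    # Toy flattened function: a dispatcher loops over three case blocks.
    flattened = {
        "entry": ["dispatcher"],
        "dispatcher": ["case_a", "case_b", "case_c", "exit"],
        "case_a": ["dispatcher"],
        "case_b": ["dispatcher"],
        "case_c": ["dispatcher"],
        "exit": [],
    }
    print(flattening_score(flattened, "entry"))
```

The fixpoint loop revisits every block until nothing changes, which illustrates why running it for each function in a large binary adds up and why a node threshold such as `MAX_NODES` is useful.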
The two reporting modes, all_heur() and partial_heur(), work as follows:
all_heur() runs every heuristic on the binary, then prints the heuristic results for all functions within the binary.
partial_heur() runs every heuristic on the binary, then prints the heuristic results for the top 10% (or bounded by MAX_FUNCTIONS) functions within the binary.
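The selection step of the partial report could look roughly like the sketch below. It assumes that "bounded by MAX_FUNCTIONS" means the top 10% list is capped at `MAX_FUNCTIONS` entries and that each function already has a numeric score. The helper `top_fraction` and its parameters are hypothetical and are not taken from the plugin.

```python
MAX_FUNCTIONS = 50  # same threshold name as in the README snippet


def top_fraction(scores, percent=10, cap=MAX_FUNCTIONS):
    """Return the highest-scoring (name, score) pairs.

    Keeps the top `percent` of entries (rounded up, at least one)
    and never more than `cap` entries. This reading of the bound
    is an assumption.
    """
    if not scores:
        return []
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = (len(scores) * percent + 99) // 100
    count = min(cap, max(1, wanted))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    demo = {"sub_%d" % i: float(i) for i in range(20)}
    for name, score in top_fraction(demo):
        print(name, score)
```

Sorting once and slicing keeps the report step cheap compared with the heuristics themselves, which still run on every function in both modes.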
The instruction overlapping heuristic uses McSema disassembly code to follow jmp and call instructions for better coverage.
Since the script uses the IDA API, any functions that are missed by IDA will likely not be detected.
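The core check behind the instruction overlapping heuristic mentioned above can be sketched independently of IDA. The sketch assumes that instructions have already been decoded by following jmp and call targets and are available as `(address, size)` pairs. The decoding itself is out of scope here, and `find_overlaps` is a hypothetical helper, not a function of the plugin.

```python
def find_overlaps(instructions):
    """Return (first, second) start addresses where `second` begins
    strictly inside the byte range of the instruction at `first`.

    instructions: iterable of (address, size) pairs.
    """
    spans = sorted(set(instructions))
    overlaps = []
    for i, (start, size) in enumerate(spans):
        end = start + size
        for other_start, _ in spans[i + 1:]:
            if other_start >= end:
                break  # sorted by address, so no later span can overlap
            if other_start > start:
                overlaps.append((start, other_start))
    return overlaps


if __name__ == "__main__":
    decoded = [
        (0x1000, 2),
        (0x1002, 5),  # covers 0x1002..0x1006
        (0x1004, 3),  # a jump target in the middle of the previous one
        (0x1007, 1),
    ]
    for first, second in find_overlaps(decoded):
        print(hex(first), "overlaps", hex(second))
```

A linear sweep never produces such pairs, so the overlap only becomes visible when branch targets are followed, which matches the note about coverage above.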
Copy the obfDetect directory and obfDetect.py into the IDA Plugins directory.
When IDA has finished loading a binary, the script prints its banner to the IDC/Python console. If it does not, the script can be reloaded by pressing Alt-E and selecting it in the plugin dropdown.
The script can be run via the File toolbar as shown below. Alternatively,
- A small binary with 2 scanned functions
- Resilience test using a large binary obfuscated using O-LLVM
- Instruction overlapping heuristic detection
- GUI Implementation branch