
A flexible network data analysis framework

nfstream

nfstream is a Python package providing fast, flexible, and expressive data structures designed to make working with online or offline network data both easy and intuitive. It aims to be the fundamental high-level building block for practical, real-world network data analysis in Python. It also has the broader goal of becoming a common network data processing framework for researchers, providing data reproducibility across experiments.

Main Features

  • Performance: nfstream is designed to be fast (10x faster with pypy3 support) with a small CPU and memory footprint.
  • Layer-7 visibility: nfstream's deep packet inspection engine is based on [nDPI][ndpi]. It allows nfstream to perform [reliable][reliable] identification of encrypted applications and metadata extraction (e.g. TLS, QUIC, TOR, HTTP, SSH, DNS).
  • Flexibility: add a flow feature in two lines as an [NFPlugin][nfplugin].
  • Machine Learning oriented: add your trained model as an [NFPlugin][nfplugin].

How to use it?

  • Dealing with a big pcap file and just want to aggregate it into network flows? nfstream makes this easy in a few lines:
    from nfstream import NFStreamer
    my_awesome_streamer = NFStreamer(source="facebook.pcap")  # or a network interface (source="eth0")
    for flow in my_awesome_streamer:
        print(flow)  # print it, append it to a pandas DataFrame, or whatever you want :)
    NFEntry(
        id=0,
        first_seen=1472393122365,
        last_seen=1472393123665,
        version=4,
        src_port=52066,
        dst_port=443,
        protocol=6,
        vlan_id=0,
        src_ip='192.168.43.18',
        dst_ip='66.220.156.68',
        total_packets=19,
        total_bytes=5745,
        duration=1300,
        src2dst_packets=9,
        src2dst_bytes=1345,
        dst2src_packets=10,
        dst2src_bytes=4400,
        expiration_id=0,
        master_protocol=91,
        app_protocol=119,
        application_name='TLS.Facebook',
        category_name='SocialNetwork',
        client_info='facebook.com',
        server_info='*.facebook.com',
        j3a_client='bfcc1a3891601edb4f137ab7ab25b840',
        j3a_server='2d1eb5817ece335c24904f516ad5da12'
    )
  • From pcap to pandas DataFrame?
    import pandas as pd
    from nfstream import NFStreamer

    streamer_awesome = NFStreamer(source='devil.pcap')
    data = []
    for flow in streamer_awesome:
        data.append(flow.to_namedtuple())  # one row per flow
    my_df = pd.DataFrame(data=data)
    my_df.head(5)  # Enjoy!
  • Didn't find a specific flow feature? Add a plugin to nfstream in a few lines:
    from nfstream import NFStreamer, NFPlugin

    class my_awesome_plugin(NFPlugin):
        def on_update(self, obs, entry):
            # Count the packets in each flow whose length is at least 666.
            if obs.length >= 666:
                entry.my_awesome_plugin += 1

    streamer_awesome = NFStreamer(source='devil.pcap', plugins=[my_awesome_plugin()])
    for flow in streamer_awesome:
        print(flow.my_awesome_plugin)  # see your dynamically created metric in generated flows
  • More examples and details are provided in the official [documentation][documentation].
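
To make the flow fields in the sample output more concrete, here is a minimal standard-library sketch of bidirectional flow aggregation. It is not nfstream's implementation: the `Packet` record, the `flow_key` and `aggregate` helpers, and the `big_packets` counter are hypothetical names chosen for illustration. It only shows how counters such as `src2dst_packets`, `total_bytes`, and `duration` can be derived from individual packets, and how a custom per-flow counter (similar in spirit to the plugin example) fits into the same loop.

```python
from collections import namedtuple

# Hypothetical packet record for illustration only; not an nfstream type.
Packet = namedtuple(
    "Packet", "time_ms src_ip src_port dst_ip dst_port protocol length"
)


def flow_key(pkt):
    """Direction-independent key: both directions map to the same flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (min(a, b), max(a, b), pkt.protocol)


def aggregate(packets, big_packet_threshold=666):
    """Group packets into bidirectional flows and accumulate counters."""
    flows = {}
    for pkt in packets:
        key = flow_key(pkt)
        flow = flows.get(key)
        if flow is None:
            # The first packet seen fixes the src -> dst direction.
            flow = flows[key] = {
                "src_ip": pkt.src_ip, "src_port": pkt.src_port,
                "dst_ip": pkt.dst_ip, "dst_port": pkt.dst_port,
                "protocol": pkt.protocol,
                "first_seen": pkt.time_ms, "last_seen": pkt.time_ms,
                "src2dst_packets": 0, "src2dst_bytes": 0,
                "dst2src_packets": 0, "dst2src_bytes": 0,
                "big_packets": 0,  # custom counter (assumption, see lead-in)
            }
        from_src = (pkt.src_ip, pkt.src_port) == (flow["src_ip"], flow["src_port"])
        direction = "src2dst" if from_src else "dst2src"
        flow[direction + "_packets"] += 1
        flow[direction + "_bytes"] += pkt.length
        flow["last_seen"] = max(flow["last_seen"], pkt.time_ms)
        if pkt.length >= big_packet_threshold:
            flow["big_packets"] += 1
    for flow in flows.values():
        # Totals and duration are derived from the directional counters.
        flow["total_packets"] = flow["src2dst_packets"] + flow["dst2src_packets"]
        flow["total_bytes"] = flow["src2dst_bytes"] + flow["dst2src_bytes"]
        flow["duration"] = flow["last_seen"] - flow["first_seen"]
    return list(flows.values())


# Three made-up packets belonging to one TCP conversation.
packets = [
    Packet(1000, "192.168.43.18", 52066, "66.220.156.68", 443, 6, 100),
    Packet(1200, "66.220.156.68", 443, "192.168.43.18", 52066, 6, 1400),
    Packet(2300, "192.168.43.18", 52066, "66.220.156.68", 443, 6, 700),
]
flows = aggregate(packets)
print(flows[0])
```

The same relationships hold in the sample output above: `total_packets` is the sum of the two directional packet counters, and `duration` is `last_seen - first_seen`.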

Prerequisites

    apt-get install libpcap-dev

Installation

Using pip

Binary installers for the latest released version are available:

    pip3 install nfstream

Build from source

If you want to build nfstream on your local machine:

    git clone https://github.com/aouinizied/nfstream.git
    cd nfstream
    python3 setup.py install
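
After either installation method, a quick way to confirm that the package is visible to the interpreter you intend to use is to ask Python's import system for it. This is a minimal sketch using only the standard library; the `is_installed` helper is a hypothetical name, and the check only confirms that the package can be located, not that its native dependencies (such as libpcap) load correctly.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the import system can locate the top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("nfstream"):
    print("nfstream is importable")
else:
    print("nfstream not found; check that it was installed for this interpreter")
```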
