Databot

A high-performance, data-driven Python programming framework for web crawlers, ETL, and data pipeline work.

Data-driven programming framework
Parallel execution using coroutines and a ThreadPool
Type- and content-based routing functions

Installing

Install and update using pip:

pip install -U databot

What's data-driven programming?

All functions are connected by pipes (queues) and communicate by data.

When data arrives, the function is called and returns its result.

Think about the pipeline operation in unix: ls|grep|sed.
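The idea can be sketched in plain Python without Databot. The following is a minimal illustration, not Databot's implementation: the names `stage`, `run_pipeline`, and `_DONE` are hypothetical, and each function runs in its own thread connected to its neighbours by a `queue.Queue`.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Call func on every item from inbox and put the result on outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the end-of-stream marker downstream
            return
        outbox.put(func(item))


def run_pipeline(items, *funcs):
    """Connect funcs with queues, feed items in, and collect the results."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Comparable to `upper | double` in a shell pipeline.
    print(run_pipeline(["a", "b", "c"], str.upper, lambda s: s * 2))
```

The functions know nothing about each other. They only see the data that arrives on their input queue, which is what makes them easy to rearrange and reuse.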

Benefits:

- Decouple data and functionality
- Easy to reuse

Databot provides pipes and routes, which make data-driven programming and complex data flows easier to build.
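To illustrate what type- and content-based routing means, here is a small plain-Python sketch. It does not use Databot's route API. The `route` helper and its `(rule, handler)` pairs are assumptions made for this example only.

```python
def route(item, routes, default=None):
    """Send item to the first handler whose rule matches.

    routes is a list of (rule, handler) pairs. A rule is either a type
    (matched with isinstance) or a predicate function (matched on content).
    Items that match no rule go to default, or pass through unchanged.
    """
    for rule, handler in routes:
        if isinstance(rule, type):
            matched = isinstance(item, rule)  # type-based rule
        else:
            matched = bool(rule(item))  # content-based rule
        if matched:
            return handler(item)
    return default(item) if default else item


ROUTES = [
    (str, str.upper),                                   # any string
    (lambda x: isinstance(x, int) and x < 0, abs),      # negative integers
    (int, lambda n: n + 1),                             # remaining integers
]

if __name__ == "__main__":
    for value in ["btc", -5, 3, 2.5]:
        print(value, "->", route(value, ROUTES))
```

Rules are checked in order, so more specific content rules should come before broader type rules.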

Databot is...

Simple

Databot is easy to use and maintain, does not need configuration files, and knows about asyncio and how to parallelize computation.

Here's one of the simple applications you can make:

Load the price of Bitcoin every 2 seconds. A more advanced price aggregator sample can be found here <https://github.com/kkyon/databot/tree/master/examples>.

.. code-block:: python

    from databot.flow import Pipe, Timer
    from databot.botframe import BotFrame
    from databot.http.http import HttpLoader


    def main():
        Pipe(
            Timer(delay=2),  # send timer data to the pipe every 2 seconds
            "http://api.coindesk.com/v1/bpi/currentprice.json",  # send the URL to the pipe when the timer triggers
            HttpLoader(),  # read the URL and load the HTTP response
            lambda r: r.json['bpi']['USD']['rate_float'],  # parse the response as JSON and extract the USD rate
            print,  # print the result
        )

        BotFrame.render('simple_bitcoin_price')  # render the flow graph
        BotFrame.run()


    main()

Flow graph

[Figure: flow graph generated by Databot for the pipeline above.]

Fast

Nodes run in parallel and perform well when processing streaming data.
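
As a rough illustration of why running nodes in parallel helps with I/O-bound steps, the sketch below uses only the standard library's `ThreadPoolExecutor`. It is not Databot code. `slow_fetch` and `run_parallel` are hypothetical names, and the `time.sleep` call stands in for a network request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(n):
    """Stand-in for an I/O-bound step such as an HTTP request."""
    time.sleep(0.05)
    return n * n


def run_parallel(func, items, workers=4):
    """Apply func to items in a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    start = time.perf_counter()
    results = run_parallel(slow_fetch, range(8))
    elapsed = time.perf_counter() - start
    print(results, f"in {elapsed:.2f}s")
```

With four workers the eight simulated requests overlap, so the run takes roughly two sleep intervals instead of eight. Exact timings depend on the machine.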

Visualization

With the render function, for example BotFrame.render('bitcoin_arbitrage'), Databot renders the data flow network into a Graphviz image:

https://github.com/kkyon/databot/blob/master/examples/bitcoin_arbitrage.png
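
For readers unfamiliar with Graphviz, the sketch below shows how a linear pipeline can be described in the DOT language, which Graphviz turns into an image. This is an independent illustration. The `to_dot` helper is hypothetical and does not reflect how BotFrame.render is implemented.

```python
def to_dot(name, steps):
    """Return Graphviz DOT source for a linear pipeline of named steps.

    Labels are inserted as-is, so they must not contain double quotes.
    """
    lines = [f"digraph {name} {{"]
    for i, label in enumerate(steps):
        lines.append(f'    n{i} [label="{label}"];')
    for i in range(len(steps) - 1):
        lines.append(f"    n{i} -> n{i + 1};")  # edge from each step to the next
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot("simple_bitcoin_price", ["Timer", "HttpLoader", "parse", "print"]))
```

Saving the output to a file and running the Graphviz `dot` command on it (for example `dot -Tpng graph.dot -o graph.png`) produces the image.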

Replay-able

With replay mode enabled (config.replay_mode=True), when an exception is raised at step N you do not need to rerun steps 1 to N.
Databot replays the data from the nearest completed node, usually step N-1.
This saves a lot of time during development.
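
The replay idea can be sketched in plain Python as caching the output of each completed step. This is an assumption-laden illustration, not Databot's replay mechanism. `run_with_replay` and the index-keyed `cache` dictionary are hypothetical, and the sketch assumes the input and the earlier steps have not changed between runs.

```python
def run_with_replay(steps, data, cache):
    """Run steps in order, reusing cached outputs of completed steps.

    cache maps a step index to that step's output from an earlier run.
    """
    start = 0
    while start in cache:  # skip every step that already completed
        data = cache[start]
        start += 1
    for i in range(start, len(steps)):
        data = steps[i](data)  # may raise; earlier outputs stay cached
        cache[i] = data
    return data


if __name__ == "__main__":
    def load(x):
        print("load runs")
        return x + 1

    def broken(x):
        raise ValueError("bug in step 2")

    def fixed(x):
        return x * 10

    cache = {}
    try:
        run_with_replay([load, broken], 1, cache)
    except ValueError:
        pass
    # After fixing step 2, the rerun starts from the cached output of step 1.
    print(run_with_replay([load, fixed], 1, cache))
```

In the usage above, `load` runs only once: the second call resumes from its cached output and executes only the corrected step.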
