Notebooks are hard to maintain. Teams often prototype projects in notebooks, but maintaining them is an error-prone process that slows progress down. Ploomber overcomes the challenges of working with .ipynb files, allowing teams to develop collaborative, production-ready pipelines using JupyterLab or any text editor.
- Scripts as notebooks. Open .py files as notebooks, then execute them from the terminal and generate an output notebook to review results.
- Dependency resolution. Quickly build a DAG by referring to previous tasks in your code; Ploomber infers execution order and orchestrates execution.
- Incremental builds. Speed up iterations by skipping tasks whose source code hasn’t changed since the last execution.
- Production-ready. Deploy to Kubernetes (via Argo Workflows), Airflow, and AWS Batch without code changes.
- Parallelization. Run independent tasks in parallel.
- Testing. Import pipelines in any testing framework and test them with any CI service (e.g. GitHub Actions).
- Flexible. Use Jupyter notebooks, Python scripts, R scripts, SQL scripts, Python functions, or a combination of them as pipeline tasks. Write pipelines using a pipeline.yaml file or with Python.
- Examples (Machine Learning pipeline, ETL, among others)
- Guest blog post on the official Jupyter blog
- Comparison with other tools
- JupyterCon 2020 talk
- Argo Community Meeting talk
- Pangeo Showcase talk (AWS Batch demo)
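To make the dependency resolution and incremental build features listed above more concrete, here is a minimal, standard-library-only sketch of the general idea. It is not Ploomber's implementation or API: the function names (`execution_order`, `source_hash`, `build`), the task names, and the hash-based change check are all hypothetical, and the sketch assumes that a task downstream of a re-executed task also needs to run again.

```python
import hashlib
from collections import deque


def execution_order(upstream):
    """Return task names so that every task comes after its upstream tasks.

    `upstream` maps each task name to the list of tasks it depends on.
    (Hypothetical helper for illustration, not part of Ploomber.)
    """
    pending = {task: set(deps) for task, deps in upstream.items()}
    # Tasks without dependencies can run right away
    ready = deque(sorted(t for t, deps in pending.items() if not deps))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        # Release tasks whose last remaining dependency just finished
        for other in sorted(pending):
            deps = pending[other]
            if task in deps:
                deps.remove(task)
                if not deps:
                    ready.append(other)
    if len(order) != len(pending):
        raise ValueError("cycle or unknown dependency in task graph")
    return order


def source_hash(source):
    """Fingerprint a task's source code."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def build(sources, upstream, previous_hashes):
    """Decide which tasks to run; return (executed, new_hashes).

    A task runs if its source changed since the last build or if any of
    its upstream tasks ran in this build (an assumption of this sketch).
    """
    executed = []
    new_hashes = {}
    for task in execution_order(upstream):
        new_hashes[task] = source_hash(sources[task])
        changed = previous_hashes.get(task) != new_hashes[task]
        upstream_ran = any(dep in executed for dep in upstream[task])
        if changed or upstream_ran:
            executed.append(task)  # a real runner would execute the task here
    return executed, new_hashes


# Hypothetical three-task pipeline: load -> clean -> plot
sources = {"load": "read raw data", "clean": "drop nulls", "plot": "make chart"}
upstream = {"load": [], "clean": ["load"], "plot": ["clean"]}

first_run, hashes = build(sources, upstream, previous_hashes={})
sources["clean"] = "drop nulls and duplicates"  # edit one task's source
second_run, hashes = build(sources, upstream, previous_hashes=hashes)

print(first_run)   # ['load', 'clean', 'plot']
print(second_run)  # ['clean', 'plot']
```

On the first build every task runs because no hashes are stored yet. After editing only the `clean` source, the second build skips `load` and re-runs `clean` and its downstream task `plot`.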
Compatible with Python 3.6 and higher.
pip install ploomber
conda install ploomber -c conda-forge
Use Binder to try out Ploomber without setting up an environment:
Or run an example locally:
# ML pipeline example
ploomber examples --name ml-basic
cd ml-basic

# if using pip
pip install -r requirements.txt

# if using conda
conda env create --file environment.yml
conda activate ml-basic

# run pipeline
ploomber build
Pipeline output is saved in the output/ folder. Check out the pipeline definition in the
To get a list of examples, run
Click here to go to our examples repository.