PySyft

PySyft decouples private data from model training, using Federated Learning, Differential Privacy, and Encrypted Computation (like Multi-Party Computation (MPC) and Homomorphic Encryption (HE)) within the main Deep Learning frameworks like PyTorch and TensorFlow.

Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place.

The Syft ecosystem seeks to change this system, allowing you to write software which can compute over information you do not own on machines you do not have (total) control over. This not only includes servers in the cloud, but also personal desktops, laptops, mobile phones, websites, and edge devices. Wherever your data wants to live in your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used privately for computation.

Mono Repo

This repo contains multiple projects which work together, namely PySyft and PyGrid.
PyGrid will be added soon; in the meantime, this is the directory structure.

OpenMined/PySyft
├── README.md   <-- You are here
└── packages
    ├── grid    <-- Coming to this Mono repo
    └── syft    <-- The Syft droids you are looking for

NOTE: Changing the entire folder structure will likely result in some minor issues.
If you spot one, please let us know or open a PR.

PySyft

PySyft is the centerpiece of the Syft ecosystem. You can use it to perform two types of computation:

  1. Dynamic: Directly compute over data you cannot see.
  2. Static: Create static graphs of computation which can be deployed/scaled at a later date on different compute.
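
The difference between the two modes can be illustrated with a small conceptual toy. This is only a sketch in plain Python: `DataOwner`, `Pointer`, and `run_plan` are hypothetical names invented for this illustration and are not part of the PySyft API. The caller only ever holds a pointer, operations run on the owner's side, and the owner reveals nothing but an aggregate.

```python
# Conceptual toy only: DataOwner, Pointer, and run_plan are hypothetical
# names for illustration and are NOT part of the PySyft API.
from typing import Callable, Dict, List

Vector = List[float]
Step = Callable[[Vector], Vector]


class DataOwner:
    """Holds private vectors; callers only ever receive pointers to them."""

    def __init__(self) -> None:
        self._store: Dict[int, Vector] = {}
        self._next_id = 0

    def put(self, values: Vector) -> "Pointer":
        self._next_id += 1
        self._store[self._next_id] = list(values)
        return Pointer(self, self._next_id)

    def run(self, step: Step, obj_id: int) -> "Pointer":
        # The step executes on the owner's side; only a new pointer leaves.
        return self.put(step(self._store[obj_id]))

    def release_mean(self, obj_id: int) -> float:
        # The only value the owner agrees to reveal in this toy: an aggregate.
        values = self._store[obj_id]
        return sum(values) / len(values)


class Pointer:
    """Opaque handle to data that stays with its owner."""

    def __init__(self, owner: DataOwner, obj_id: int) -> None:
        self.owner = owner
        self.obj_id = obj_id

    def scale(self, factor: float) -> "Pointer":
        # Dynamic mode: the operation is sent and executed right away.
        return self.owner.run(lambda xs: [x * factor for x in xs], self.obj_id)

    def mean(self) -> float:
        return self.owner.release_mean(self.obj_id)


def run_plan(owner: DataOwner, plan: List[Step], ptr: Pointer) -> Pointer:
    # Static mode: the plan was defined ahead of time, without any data.
    for step in plan:
        ptr = owner.run(step, ptr.obj_id)
    return ptr


owner = DataOwner()
ptr = owner.put([1.0, 2.0, 3.0])

dynamic_mean = ptr.scale(10.0).mean()

plan: List[Step] = [
    lambda xs: [x * 2.0 for x in xs],
    lambda xs: [x + 1.0 for x in xs],
]
static_mean = run_plan(owner, plan, ptr).mean()

print(dynamic_mean, static_mean)  # 20.0 5.0
```

In the dynamic case each operation is dispatched as it is written; in the static case the same kind of steps are collected first and handed to whichever owner runs them later.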

The PyGrid library serves as an API for the management and deployment of PySyft at scale. It also allows you to extend PySyft for the purposes of Federated Learning on web, mobile, and edge devices using the following Syft worker libraries:

  • KotlinSyft (Android)
  • SwiftSyft (iOS)
  • syft.js (JavaScript)
  • PySyft (Python, you can use PySyft itself as one of these "FL worker libraries")
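
As a rough illustration of what Federated Learning workers contribute, the sketch below shows a weighted average of model weights in the style of federated averaging. It is plain Python under simplifying assumptions (each worker reports a flat list of weights and its sample count); `federated_average` is a hypothetical helper, not a PySyft or PyGrid function.

```python
# Illustrative sketch only: federated_average is a hypothetical helper,
# not part of PySyft or PyGrid.
from typing import List, Tuple

# Each update is (locally trained weights, number of local samples).
Update = Tuple[List[float], int]


def federated_average(updates: List[Update]) -> List[float]:
    """Average worker weights, weighting each worker by its sample count."""
    total_samples = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [
        sum(weights[i] * n for weights, n in updates) / total_samples
        for i in range(dim)
    ]


# Two workers train locally and send back only weights, never raw data.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
averaged = federated_average(updates)
print(averaged)  # [2.5, 3.5]
```

The worker with 30 samples pulls the average three times as strongly as the worker with 10, which is the point of weighting by sample count.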

However, the Syft ecosystem only focuses on consistent object serialization/deserialization, core abstractions, and algorithm design/execution across these languages. These libraries alone will not connect you with data in the real world. The Syft ecosystem is supported by the Grid ecosystem, which focuses on the deployment, scalability, and other additional concerns around running real-world systems to compute over and process data (such as data compliance web applications).

  • PySyft is the library that defines objects, abstractions, and algorithms.
  • PyGrid is the platform which lets you deploy them within a real institution.
  • PyGrid Admin is a UI which allows a data owner to manage their PyGrid deployment.

A more detailed explanation of PySyft can be found in the white paper on arXiv.

PySyft has also been explained in videos on YouTube.

Pre-Installation

PySyft is available on PyPI and Conda.

We recommend that you install PySyft within a virtual environment like Conda,
due to its ease of use. If you are using Windows, we suggest installing
Anaconda and using the Anaconda Prompt to work from the command line.

$ conda create -n pysyft python=3.9
$ conda activate pysyft
$ conda install jupyter notebook

Version Support

We support Linux, macOS, and Windows, and the following Python and Torch versions.
Older versions may work, but we no longer test or support them.

Python: 3.7, 3.8, 3.9
Torch:  1.6, 1.7, 1.8
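
A quick way to check an environment against the versions listed above is sketched below. The helper names `major_minor` and `is_listed` are hypothetical and exist only for this example; the version sets are copied from the list above, and only the major and minor numbers are compared.

```python
# Illustrative sketch: major_minor and is_listed are hypothetical helpers.
import sys
from typing import Optional, Tuple

SUPPORTED_PYTHON = {(3, 7), (3, 8), (3, 9)}
SUPPORTED_TORCH = {(1, 6), (1, 7), (1, 8)}


def major_minor(version: str) -> Tuple[int, int]:
    """Parse a version string such as '1.8.1+cu111' into (1, 8)."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def is_listed(python_version: Tuple[int, int],
              torch_version: Optional[str] = None) -> bool:
    """Return True if the Python (and, if given, Torch) version is listed."""
    if tuple(python_version[:2]) not in SUPPORTED_PYTHON:
        return False
    if torch_version is not None:
        return major_minor(torch_version) in SUPPORTED_TORCH
    return True


# Check the running interpreter; pass torch.__version__ as the second
# argument if PyTorch is already installed.
print(is_listed(sys.version_info[:2]))
print(is_listed((3, 9), "1.8.1"))  # True
```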

Installation

Pip

$ pip install syft

This will auto-install PyTorch and other dependencies as required to run the
examples and tutorials. For more information on building from source, see the contribution guide.

Examples

A comprehensive list of examples can be found here.

These tutorials cover a variety of Python libraries for data science and machine learning.

You can try all the examples by launching a Jupyter Notebook and navigating to the examples folder.

$ jupyter notebook

GitHub

https://github.com/OpenMined/PySyft