An open source inference server for your machine learning models.

MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec.

You can read more about the goals of this project on the inital design document.


You can install the mlserver package running:

pip install mlserver

Note that to use any of the optional inference runtimes, you'll need to install the relevant package. For example, to serve a scikit-learn model, you would need to install the mlserver-sklearn package:

pip install mlserver-sklearn

For further information on how to use MLServer, you can check any of the available examples.

Inference Runtimes

Inference runtimes allow you to define how your model should be used within MLServer. You can think of them as the backend glue between MLServer and your machine learning framework of choice. You can read more about inference runtimes in their documentation page.

Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common frameworks. This allows you to start serving models saved in these frameworks straight away.

Out of the box, MLServer provides support for:

Framework Supported Documentation
Scikit-Learn MLServer SKLearn
XGBoost MLServer XGBoost
Spark MLlib MLServer MLlib
LightGBM MLServer LightGBM
MLflow MLServer MLflow


To see MLServer in action, check out our full list of examples. You can find below a few selected examples showcasing how you can leverage MLServer to start serving your machine learning models.

Developer Guide


Both the main mlserver package and the inference runtimes packages try to follow the same versioning schema. To bump the version across all of them, you can use the ./hack/ script. For example:

./hack/ 0.2.0.dev1