rubrix

Rubrix is a free and open-source tool for exploring and iterating on data for artificial intelligence projects.

Rubrix focuses on enabling novel, human-in-the-loop workflows involving data scientists, subject matter experts, and ML/data engineers.

With Rubrix, you can:

  • Monitor the predictions of deployed models.
  • Label data to start a new project or evolve an existing one.
  • Iterate on ground truth and predictions to debug, track, and improve your data and models over time.
  • Build custom applications and dashboards on top of your model predictions.

We've tried to make working with Rubrix easy and fun, while keeping it scalable and flexible.

Rubrix is composed of:

  • a Python library to bridge data and models, which you can install via pip.
  • a web application to explore and label data, which you can launch using Docker or directly with Python.

[Screenshot: the Rubrix UI in annotation mode]

For more information, visit the documentation. To get started, keep reading.

Get started

To get started you need to follow three steps:

  1. Install the Python client
  2. Launch the web app
  3. Start logging data

1. Install the Python client

You can install the Python client with pip:

pip install rubrix
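To confirm that the installation worked, you can ask Python which version of the package it sees. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is ours for illustration and is not part of Rubrix.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether the Rubrix client is visible to this Python interpreter.
version = installed_version("rubrix")
print(f"rubrix {version}" if version else "rubrix is not installed")
```

If the message says the package is missing, check that pip and python point to the same environment.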

2. Launch the web app

There are two ways to launch the web app:

  • Using docker-compose (recommended).
  • Executing the server code manually.

Using docker-compose (recommended)

Create a folder:

mkdir rubrix && cd rubrix

Then launch the containerized web app with the following command:

wget -O docker-compose.yml https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml && docker-compose up

This is the recommended way because it automatically includes an
Elasticsearch instance, Rubrix's main persistence layer.
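The containers can take a little while to come up. If you script your setup, a small polling helper like the one below can wait until the web app answers. This is a sketch, not part of Rubrix: the function name wait_for_app and its parameters are hypothetical, and the default URL assumes the app is served at http://localhost:6900/, the address used later in this guide.

```python
import time
import urllib.error
import urllib.request

# Local checks should not go through a system-wide HTTP proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_app(url="http://localhost:6900/", timeout=60.0, interval=2.0):
    """Poll `url` until it answers or `timeout` seconds pass; return True on success."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _opener.open(url, timeout=5.0):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused or timed out: the app is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_app() with the defaults retries every two seconds for up to a minute and returns False if the app never responds.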

Executing the server code manually

When executing the server code manually you need to provide an Elasticsearch instance yourself.

  1. Install Elasticsearch (we recommend version 7.10) and launch an Elasticsearch instance. For macOS and Windows there are Homebrew formulae and an MSI package, respectively.
  2. Install the Rubrix Python library together with its server dependencies:
pip install "rubrix[server]"
  3. Launch a local instance of the Rubrix web app:
python -m rubrix.server

By default, the Rubrix server looks for your Elasticsearch endpoint at http://localhost:9200.
To use a different endpoint, set the ELASTICSEARCH environment variable to its URL.
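If you start the server from a script, you can set the variable for that one process instead of exporting it globally. The sketch below assumes only what is stated above (the ELASTICSEARCH variable and the python -m rubrix.server command); the helper names server_env and launch_server and the example host es.internal are hypothetical.

```python
import os
import subprocess
import sys


def server_env(endpoint=None, base_env=None):
    """Return a copy of the environment, with ELASTICSEARCH set when `endpoint` is given."""
    env = dict(os.environ if base_env is None else base_env)
    if endpoint is not None:
        env["ELASTICSEARCH"] = endpoint
    return env


def launch_server(endpoint=None):
    """Run `python -m rubrix.server` with the chosen endpoint; blocks until it exits."""
    command = [sys.executable, "-m", "rubrix.server"]
    return subprocess.run(command, env=server_env(endpoint))


# Example (hypothetical host): launch_server("http://es.internal:9200")
```

Leaving endpoint as None keeps whatever the current environment defines, so the server falls back to its default.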

3. Start logging data

The following code will log one record into the example-dataset dataset:

import rubrix as rb

rb.log(
    rb.TextClassificationRecord(inputs="my first rubrix example"),
    name='example-dataset'
)

BulkResponse(dataset='example-dataset', processed=1, failed=0)
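To log more than one text, the same call can be wrapped in a loop. This sketch uses only the two names shown above, rb.TextClassificationRecord and rb.log; the helper log_texts and its optional rb argument are ours, added so the function can be tried with a stand-in module when no server is running.

```python
def log_texts(texts, dataset_name, rb=None):
    """Log each text as a TextClassificationRecord and return the list of responses."""
    if rb is None:
        # Imported lazily so the helper can also be exercised with a stub module.
        import rubrix as rb
    responses = []
    for text in texts:
        record = rb.TextClassificationRecord(inputs=text)
        responses.append(rb.log(record, name=dataset_name))
    return responses


# Example: log_texts(["first example", "second example"], "example-dataset")
```

Each call to rb.log returns its own response, so the list has one entry per text.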

If you go to your Rubrix app at http://localhost:6900/, you should see your first dataset.

Congratulations! You are ready to start using Rubrix with your own data.

GitHub

https://github.com/recognai/rubrix