A terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters

Sep 10, 2021

CLabel

CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that data.

Install & Quickstart

pip install clabel

Type clabel to run. Everything should happen in the terminal from there.

Currently, clabel can only import CSV files. It expects your CSV to contain two columns: a column of text (string) and a column of cluster labels (int). You’ll identify these columns the first time you import a dataset.
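As a rough illustration of the expected input, the sketch below writes a two-column CSV using only the Python standard library. The column names `text` and `cluster`, the helper `write_cluster_csv`, and the sample rows are assumptions made for this example; clabel asks you to identify the columns at import time, so the names themselves should not matter.

```python
import csv
import io


def write_cluster_csv(rows, fileobj, text_col="text", cluster_col="cluster"):
    """Write (text, cluster_id) pairs as a two-column CSV.

    The column names are hypothetical; pick whatever suits your data.
    """
    writer = csv.writer(fileobj)
    writer.writerow([text_col, cluster_col])
    for text, cluster_id in rows:
        # Cluster labels are stored as integers, as the tool expects.
        writer.writerow([text, int(cluster_id)])


# Made-up sample data: each row is (text, cluster id).
rows = [
    ("refund not received", 0),
    ("app crashes on login", 1),
    ("where is my refund?", 0),
]

buffer = io.StringIO()
write_cluster_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a file on disk instead, pass a handle opened with `open("clusters.csv", "w", newline="")` in place of the in-memory buffer.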

The workflow is:

1. Pick a cluster to view examples. You’ll view these through a pager so you can page through them.
2. Come up with a name for that cluster (Declare Name).
3. Repeat steps 1 and 2 until all your clusters have names.

You can persist any cluster labels to a JSON file when you exit, so you don’t have to complete labeling in one session. The next time you start clabel, select that JSON file to load those labels and continue labeling.
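Once labels are saved, you may want to attach the names back onto your original rows. The sketch below shows one way this could look. It assumes the saved JSON is a flat mapping from cluster id (as a string) to name; that layout is a guess for illustration and not clabel's documented format, so inspect your actual file and adapt accordingly. The function `apply_labels` and the column names are likewise hypothetical.

```python
import csv
import io
import json


def apply_labels(csv_file, labels, cluster_col="cluster", name_col="cluster_name"):
    """Return CSV rows as dicts with an extra column holding the cluster name.

    `labels` is ASSUMED to map cluster id strings to names; clusters
    without a saved name get an empty string.
    """
    labeled_rows = []
    for row in csv.DictReader(csv_file):
        row[name_col] = labels.get(row[cluster_col], "")
        labeled_rows.append(row)
    return labeled_rows


# Hypothetical label file contents (the real layout may differ).
labels = json.loads('{"0": "Refunds", "1": "Login bugs"}')

# Made-up sample data standing in for the CSV that was imported.
data = io.StringIO(
    "text,cluster\n"
    "refund not received,0\n"
    "app crashes on login,1\n"
    "slow checkout,2\n"
)

labeled = apply_labels(data, labels)
for row in labeled:
    print(row["cluster"], row["cluster_name"] or "(unlabeled)", "-", row["text"])
```

Leaving unlabeled clusters blank rather than raising an error mirrors the idea that labeling can span several sessions.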

GitHub

https://github.com/pmbaumgartner/clabel
