Karate Club
Karate Club is an unsupervised machine learning extension library for NetworkX.
Karate Club consists of state-of-the-art methods for unsupervised learning on graph-structured data. Put simply, it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. The implemented methods come from a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences and workshops, as well as from prominent journals.
The newly introduced graph classification datasets are available at SNAP, TUD Graph Kernel Datasets, and GraphLearning.io.
Citing
If you find Karate Club and the new datasets useful in your research, please consider citing the following paper:
@inproceedings{karateclub,
title = {{Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs}},
author = {Benedek Rozemberczki and Oliver Kiss and Rik Sarkar},
year = {2020},
pages = {3125–3132},
booktitle = {Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20)},
organization = {ACM},
}
A simple example
Karate Club makes modern community detection techniques easy to use (see here for the accompanying tutorial). For example, this is all it takes to run Ego-splitting on a Watts-Strogatz graph:
import networkx as nx
from karateclub import EgoNetSplitter
g = nx.newman_watts_strogatz_graph(1000, 20, 0.05)
splitter = EgoNetSplitter(1.0)
splitter.fit(g)
print(splitter.get_memberships())
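The memberships printed above are overlapping, so a node can belong to more than one community. The sketch below shows one way to regroup such a result by community. It assumes that `get_memberships()` returns a dictionary mapping each node to a list of community ids. The `memberships` dictionary is hand-written sample data, not real output, and the helper `communities_from_memberships` is a hypothetical name that is not part of Karate Club.

```python
from collections import defaultdict

# Hypothetical sample data shaped like the assumed return value of
# splitter.get_memberships(): node id -> list of community ids.
memberships = {0: [0], 1: [0, 1], 2: [1], 3: [1, 2], 4: [2]}


def communities_from_memberships(memberships):
    """Invert a node -> community ids mapping into community id -> sorted nodes."""
    communities = defaultdict(list)
    for node, community_ids in memberships.items():
        for community_id in community_ids:
            communities[community_id].append(node)
    return {cid: sorted(nodes) for cid, nodes in communities.items()}


communities = communities_from_memberships(memberships)

# Nodes that sit in more than one community are the overlapping ones.
overlapping_nodes = sorted(
    node for node, ids in memberships.items() if len(ids) > 1
)

print(communities)
print(overlapping_nodes)
```

With real output, passing `splitter.get_memberships()` in place of the sample dictionary should work the same way, provided the return type matches the assumption above.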
Installation
Karate Club can be installed with the following pip command.
$ pip install karateclub
As we create new releases frequently, upgrading the package from time to time is worthwhile.
$ pip install karateclub --upgrade
Running examples
As part of the documentation, we provide a number of use cases that show how the clusterings and embeddings can be used for downstream learning. These can be accessed here with detailed explanations.
Besides the case studies, we provide synthetic examples for each model. These can be tried out by running the example scripts. For example, to run the Graph2Vec snippet:
$ cd examples/whole_graph_embedding/
$ python graph2vec_example.py
Running tests
$ python setup.py test