A programming library for clustering text

Carrot2

Carrot2 is a programming library for clustering text. It can automatically discover groups of related documents and label them with short key terms or phrases.

Carrot2 can, for example, turn search result titles and snippets into labeled groups of related results.

Installation

Carrot2 is a software component and typically integrates with other software
as a library dependency (see the API documentation available with each release).

Binary releases are published on GitHub. They ship with the DCS
(document clustering server), an HTTP/JSON REST API service
for integration with other languages.
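
As a rough illustration of calling the DCS from another language, the Python sketch below builds a JSON clustering request and posts it over HTTP. Several details are assumptions not stated on this page and should be checked against the DCS documentation shipped with your release: the address `http://localhost:8080`, the `/service/cluster` endpoint path, the `language`, `algorithm` and `documents` request fields, and the `Lingo` algorithm name. The helper names `build_request` and `cluster` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: a DCS instance running locally; adjust host, port and path
# to match the release you actually run.
DCS_URL = "http://localhost:8080/service/cluster"


def build_request(snippets, language="English", algorithm="Lingo"):
    """Build a clustering request body from (title, snippet) pairs.

    The field names used here are assumed, not taken from this page;
    verify them against the DCS documentation of your release.
    """
    return {
        "language": language,
        "algorithm": algorithm,
        "documents": [{"title": t, "snippet": s} for t, s in snippets],
    }


def cluster(snippets, url=DCS_URL):
    """POST the documents to the DCS and return the parsed JSON response."""
    body = json.dumps(build_request(snippets)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a DCS instance running, `cluster([("Some title", "Some snippet text")])` would return the server's JSON response as a Python dictionary. Its exact shape depends on the release, so inspect it before relying on particular keys.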

Integration with document retrieval services is possible
via the Apache Solr plugin
and the Elasticsearch plugin.

Documentation

GitHub
