Fast MST Algorithm

Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Usage

The implementation finds Maximum Spanning Tree. If you want minimum spanning tree instead you can provide negative weights. The implementation contains three components:

  • Tarjan’s algorithm for finding unconstrained MST
  • Reweighting meta-algorithm for constraining MST to have only one ROOT edge (see reference below)
  • ArcMax optimization for speed improvements on easy inputs

Everything relevant for MST dependency parsing can be accessed trough fast_parse function as shown here:

>> from mst import fast_parse
>>> import numpy as np
>>> example_weights = np.random.rand(10, 10)
>>> fast_parse(example_weights, one_root=True)
array([-1, 3, 5, 6, 9, 1, 0, 1, 9, 3])
>>> fast_parse(example_weights, one_root=False)
array([-1, 3, 5, 0, 9, 1, 0, 1, 9, 3])
“>

>>> from mst import fast_parse
>>> import numpy as np
>>> example_weights = np.random.rand(10, 10)
>>> fast_parse(example_weights, one_root=True)
array([-1,  3,  5,  6,  9,  1,  0,  1,  9,  3])
>>> fast_parse(example_weights, one_root=False)
array([-1,  3,  5,  0,  9,  1,  0,  1,  9,  3])

Input weight matrix weight [i, j] is interpreted the weight of arc going from j to i (j is the head while i is the dependent). Token 0 is treated at the root note of the MST (it doesn’t have an incoming arc).

References

The algorithms and their performance are presented in:

A Root of a Problem: Optimizing Single-Root Dependency Parsing
Miloš Stanojević and Shay B. Cohen
EMNLP 2021

GitHub

https://github.com/stanojevic/Fast-MST-Algorithm