tskit

Succinct tree sequences are a highly efficient way of storing a set of related DNA sequences by encoding their ancestral history as a set of correlated trees along the genome. The tree sequence format is output by a number of software libraries and programs (such as msprime, SLiM, fwdpp, and tsinfer) that either simulate or infer the evolutionary history of genetic sequences.

The tskit library provides the underlying functionality used to load, examine, and manipulate tree sequences, including efficient methods for calculating genetic statistics. It is often installed as a dependency of other software packages such as those listed above. Please see the documentation, which includes installation instructions, for further details.

tskit has both a Python API and a C API.

Python API

Most users of tskit will use the Python API, as it provides a convenient, high-level interface for accessing, analysing, and creating tree sequences.

C API

The tskit C API provides comprehensive, low-level methods for manipulating and processing tree sequences. Written to the C99 standard and fully thread-safe, it can be used from either C or C++.

GitHub