Part of the HyRiver software stack for accessing hydrology data through web services


PyGeoHydro is part of the HyRiver software stack, which is designed to aid in watershed analysis through web services. This package provides access to some of the public web services that offer geospatial hydrology data. It has three main modules: pygeohydro, plot, and helpers.

The pygeohydro module can pull data from the following web services:

  • NWIS for daily mean streamflow observations,
  • NID for accessing the National Inventory of Dams in the US,
  • HCDN 2009 for identifying sites where human activity affects the natural flow of the watercourse,
  • NLCD 2016 for land cover/land use, imperviousness, and canopy data,
  • SSEBop for daily actual evapotranspiration, for both single pixel and gridded data.

Also, it has two other functions:

  • interactive_map: Interactive map for exploring NWIS stations within a bounding box.
  • cover_statistics: Categorical statistics of land use/land cover data.

The plot module includes two main functions:

  • signatures: Hydrologic signature graphs.
  • cover_legends: Official NLCD land cover legends for plotting a land cover dataset.

The helpers module includes:

  • nlcd_helper: A lookup table of roughness coefficients for each land cover type, which is useful for overland flow routing, among other applications.
  • nwis_error: A dataframe for finding information about errors in NWIS requests.

Moreover, requests for additional databases and functionalities can be submitted via the issue tracker.

You can find some example notebooks here.

You can also try PyGeoHydro without installing it on your system by clicking on the binder badge. A Jupyter Lab instance with the HyRiver stack pre-installed will be launched in your web browser, and you can start coding!

Please note that this project is in the early stages of development: the provided functionalities should be stable, but the APIs may change in new releases. We would appreciate it if you gave this project a try and provided feedback. Contributions are most welcome.



You can install PyGeoHydro using pip after installing libgdal on your system (for example, on Ubuntu run sudo apt install libgdal-dev). PyGeoHydro also has an optional dependency, requests-cache, for persistent caching. We highly recommend installing this package, as it can significantly speed up sending and receiving queries. You don't have to change anything in your code: under the hood, PyGeoHydro looks for requests-cache and, if it is available, automatically uses persistent caching:

$ pip install pygeohydro

Alternatively, PyGeoHydro can be installed from the conda-forge repository using Conda:

$ conda install -c conda-forge pygeohydro
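
The optional-dependency behavior described above (use requests-cache when it is installed, otherwise carry on without it) can be illustrated with a small standalone sketch. This is one common way to detect an optional package and is not necessarily how PyGeoHydro does it internally; the helper name has_optional is hypothetical.

```python
from importlib.util import find_spec


def has_optional(module_name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    # find_spec returns None when no top-level module with this name is found.
    return find_spec(module_name) is not None


# requests-cache is imported under the name "requests_cache".
use_cache = has_optional("requests_cache")
print("persistent caching available:", use_cache)
```

A check like this lets calling code stay unchanged whether or not the optional package is present.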

Quick start

We can explore the available NWIS stations within a bounding box using the interactive_map function. It returns an interactive map, and clicking on a station shows some of its most important properties.

import pygeohydro as gh

bbox = (-69.5, 45, -69, 45.5)
gh.interactive_map(bbox)
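
As a rough illustration of what "within a bounding box" means here, the sketch below tests whether a point falls inside the box. It assumes the tuple is ordered (west, south, east, north) in decimal degrees, which is consistent with the values above but is an assumption; in_bbox is a hypothetical helper, not part of PyGeoHydro.

```python
def in_bbox(lon, lat, bbox):
    """Return True if (lon, lat) lies inside bbox = (west, south, east, north)."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


# Same bounding box as in the example above.
bbox = (-69.5, 45, -69, 45.5)

inside = in_bbox(-69.25, 45.25, bbox)   # point in the middle of the box
outside = in_bbox(-70.0, 45.25, bbox)   # point west of the box
```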


We can select all the stations within this bounding box that have daily mean streamflow data from 2000-01-01 to 2010-12-31:

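from pygeohydro import NWIS

nwis = NWIS()
# Restrict the query to the bounding box and to sites with daily values ("dv").
query = {
    **nwis.query_bybox(bbox),
    "hasDataTypeCd": "dv",
    "outputDataTypeCd": "dv",
}
info_box = nwis.get_info(query)
dates = ("2000-01-01", "2010-12-31")
# Keep only the stations whose record spans the whole period.
stations = info_box[
    (info_box.begin_date <= dates[0]) & (info_box.end_date >= dates[1])
].site_no.tolist()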
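
The date filter above keeps a station only if its record starts on or before the first date and ends on or after the last one. A minimal standalone sketch of the same logic, using plain Python and made-up records, is shown below. The station identifiers and dates are hypothetical; only the field names mirror the example above.

```python
from datetime import date

# Hypothetical station records for illustration only.
records = [
    {"site_no": "A", "begin_date": date(1990, 1, 1), "end_date": date(2020, 1, 1)},
    {"site_no": "B", "begin_date": date(2005, 6, 1), "end_date": date(2020, 1, 1)},
    {"site_no": "C", "begin_date": date(1980, 1, 1), "end_date": date(2009, 12, 31)},
]


def covers_period(record, start, end):
    """True if the record starts on or before `start` and ends on or after `end`."""
    return record["begin_date"] <= start and record["end_date"] >= end


def select_sites(records, start, end):
    """Return the site numbers whose record spans the whole period."""
    return [r["site_no"] for r in records if covers_period(r, start, end)]


selected = select_sites(records, date(2000, 1, 1), date(2010, 12, 31))
```

Station B starts too late and station C ends too early, so only station A spans the full period.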

Then, we can get the streamflow data in mm/day (by default the data are in cms) and plot them:

from pygeohydro import plot

qobs = nwis.get_streamflow(stations, dates, mmd=True)
plot.signatures(qobs)
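
For readers unfamiliar with the units, converting discharge in cms (cubic meters per second) to a depth in mm/day requires the drainage area of the station. The sketch below shows the arithmetic under the assumption that the area is known in square kilometers; cms_to_mmd is a hypothetical helper and is not PyGeoHydro's implementation.

```python
SECONDS_PER_DAY = 86_400


def cms_to_mmd(q_cms, area_km2):
    """Convert discharge in m^3/s to runoff depth in mm/day over a drainage area."""
    if area_km2 <= 0:
        raise ValueError("drainage area must be positive")
    area_m2 = area_km2 * 1e6                        # km^2 -> m^2
    depth_m_per_day = q_cms * SECONDS_PER_DAY / area_m2
    return depth_m_per_day * 1000.0                 # m -> mm


# 1 m^3/s over 86.4 km^2 corresponds to about 1 mm/day.
runoff = cms_to_mmd(1.0, 86.4)
```

Expressing streamflow as a depth makes it directly comparable to precipitation and evapotranspiration, which are usually reported in mm/day.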

Moreover, we can get land use/land cover data using the nlcd function, percentages of land cover types using cover_statistics, and actual ET with ssebopeta_bygeom:

from pynhd import NLDI

geometry = NLDI().get_basins("01031500").geometry[0]
lulc = gh.nlcd(geometry, 100, years={"impervious": None, "cover": 2016, "canopy": None})
stats = gh.cover_statistics(lulc.cover)
eta = gh.ssebopeta_bygeom(geometry, dates=("2005-10-01", "2005-10-05"))
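
To make "percentages of land cover types" concrete, here is a minimal standalone sketch that computes the share of each class in a small list of pixel codes. It is not the cover_statistics implementation; the sample raster is made up, and only a few NLCD class codes are included for illustration.

```python
from collections import Counter

# A small subset of NLCD class codes and their names.
NLCD_CLASSES = {
    11: "Open Water",
    41: "Deciduous Forest",
    42: "Evergreen Forest",
    81: "Pasture/Hay",
}


def class_percentages(pixels):
    """Return {class name: percent of pixels} for a flat iterable of class codes."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        NLCD_CLASSES.get(code, f"Unknown ({code})"): 100.0 * n / total
        for code, n in counts.items()
    }


# Hypothetical 2x4 land cover raster, flattened row by row.
sample = [41, 41, 42, 11, 41, 81, 42, 41]
percentages = class_percentages(sample)
```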



Additionally, we can pull all the US dams data using get_nid and get_nid_codes:

nid = gh.get_nid()
codes = gh.get_nid_codes()