dask-rasterio

dask-rasterio provides functions for reading and writing rasters in parallel using Rasterio and Dask arrays.

Usage

Read a multiband raster

>>> from dask_rasterio import read_raster

>>> array = read_raster('tests/data/RGB.byte.tif')
>>> array
dask.array<stack, shape=(3, 718, 791), dtype=uint8, chunksize=(1, 3, 791)>

>>> array.mean()
dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=()>
>>> array.mean().compute()
40.858976977533935

Read a single band from a raster

>>> from dask_rasterio import read_raster

>>> array = read_raster('tests/data/RGB.byte.tif', band=3)
>>> array
dask.array<raster, shape=(718, 791), dtype=uint8, chunksize=(3, 791)>

Write a singleband or multiband raster

>>> from dask_rasterio import read_raster, write_raster

>>> array = read_raster('tests/data/RGB.byte.tif')

>>> new_array = array & (array > 100)
>>> new_array
dask.array<and_, shape=(3, 718, 791), dtype=uint8, chunksize=(1, 3, 791)>

>>> prof = ... # reuse profile from tests/data/RGB.byte.tif...
>>> write_raster('processed_image.tif', new_array, **prof)
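
The profile passed to write_raster has to describe the array being written. The sketch below shows one possible way to derive it from an existing raster. It is not necessarily how the project builds its profiles: load_profile and profile_for are hypothetical helper names, and the only library feature assumed is Rasterio's profile attribute on an open dataset.

```python
def load_profile(path):
    """Return the profile (driver, dtype, size, CRS, transform, ...) of a raster."""
    # Imported lazily so that profile_for below can be used without Rasterio.
    import rasterio

    with rasterio.open(path) as src:
        return dict(src.profile)


def profile_for(array, src_profile):
    """Copy src_profile and update the fields that depend on the array itself."""
    prof = dict(src_profile)  # copy, so the caller's profile is not modified
    if len(array.shape) == 3:
        count, height, width = array.shape  # (bands, rows, cols)
    elif len(array.shape) == 2:
        count = 1
        height, width = array.shape  # (rows, cols)
    else:
        raise TypeError('array must be 2D or 3D')
    prof.update(count=count, height=height, width=width, dtype=str(array.dtype))
    return prof


# Hypothetical usage, assuming new_array comes from the example above:
#   prof = profile_for(new_array, load_profile('tests/data/RGB.byte.tif'))
#   write_raster('processed_image.tif', new_array, **prof)
```

Updating count and dtype matters mostly when the processed array differs from the source, for example when only one band is written or when the result is a float array.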

Chunk size

Both read_raster and write_raster accept a block_size argument that
acts as a multiplier on the block size of the raster. The default value is 1,
which means the dask array chunk size will be the same as the block size of
the raster file. Larger values produce fewer, larger chunks. Adjust this
value according to how much memory your machine has and the block size of
the raster.
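
To get a feel for a suitable value, the arithmetic can be sketched in plain Python. This is only an estimate under stated assumptions: it assumes the multiplier applies to both the row and column dimensions of a block and that chunks are clipped to the raster extent, which may not match the library's behaviour exactly. The function names are hypothetical.

```python
from math import ceil


def chunk_shape(block_shape, raster_shape, block_size=1):
    """Scale a (rows, cols) block shape by block_size, clipped to the raster."""
    block_rows, block_cols = block_shape
    rows, cols = raster_shape
    return (min(block_rows * block_size, rows),
            min(block_cols * block_size, cols))


def chunk_bytes(chunk, itemsize):
    """Approximate memory used by one single-band chunk, in bytes."""
    return chunk[0] * chunk[1] * itemsize


def chunks_per_band(chunk, raster_shape):
    """Number of chunks needed to cover one band of the raster."""
    return ceil(raster_shape[0] / chunk[0]) * ceil(raster_shape[1] / chunk[1])


# Values taken from the RGB.byte.tif example: 718 x 791 pixels,
# blocks of 3 x 791 pixels, uint8 data (1 byte per pixel).
raster = (718, 791)
block = (3, 791)

small = chunk_shape(block, raster, block_size=1)    # (3, 791)
large = chunk_shape(block, raster, block_size=100)  # (300, 791)

print(small, chunk_bytes(small, 1), chunks_per_band(small, raster))
print(large, chunk_bytes(large, 1), chunks_per_band(large, raster))
```

With the default multiplier, each band is split into 240 chunks of about 2 KB each. A multiplier of 100 reduces that to 3 chunks of roughly 237 KB, which lowers task-scheduling overhead at the cost of more memory per task.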

Install

Install with pip:

pip install dask-rasterio

Development

This project is managed with Poetry. If
you do not have it installed, please refer to the
Poetry installation instructions.

Now, clone the repository and run poetry install. This will create a virtual
environment and install all required packages there.

Run poetry run pytest to run all tests.

Run poetry build to build the package in dist/.

GitHub

https://github.com/dymaxionlabs/dask-rasterio