Equivariant CNNs for the sphere and SO(3) implemented in PyTorch
This library contains a PyTorch implementation of the rotation-equivariant CNNs for spherical signals (e.g. omnidirectional images, signals on the globe) as presented in the paper "Spherical CNNs" (Cohen, Geiger, Köhler, Welling, ICLR 2018). Equivariant networks for the plane are available here.
- PyTorch: http://pytorch.org/ (>= 0.4.0)
- cupy: https://github.com/cupy/cupy
- lie_learn: https://github.com/AMLab-Amsterdam/lie_learn
- pynvrtc: https://github.com/NVIDIA/pynvrtc
(commands to install all the dependencies in a new conda environment)
conda create --name cuda9 python=3.6
conda activate cuda9

# s2cnn deps
#conda install pytorch torchvision cuda90 -c pytorch # get correct command line at http://pytorch.org/
conda install -c anaconda cupy
pip install pynvrtc joblib

# lie_learn deps
conda install -c anaconda cython
conda install -c anaconda requests

# shrec17 example dep
conda install -c anaconda scipy
conda install -c conda-forge rtree shapely
conda install -c conda-forge pyembree
pip install "trimesh[easy]"
To install, run
$ python setup.py install
Please have a look at the examples.
Please cite the Spherical CNNs paper in your work when using this library in your experiments.
Design choices for Spherical CNN Architectures
Spherical CNNs come with different choices of grids and grid hyperparameters which, at first glance, are not obviously related to those of conventional CNNs.
s2_near_identity_grid and so3_near_identity_grid are the preferred choices since they correspond to spatially localized kernels, defined at the north pole and rotated over the sphere via the action of SO(3).
In contrast, s2_equatorial_grid and so3_equatorial_grid define line-like (or ring-like) kernels around the equator.
To clarify the possible parameter choices for s2_near_identity_grid:
max_beta:
Sets the size of the kernel, measured as an angle from the north pole.
Conventional CNNs on flat space usually use a fixed kernel size but pool the signal spatially.
This spatial pooling gives the kernels in later layers an effectively increased field of view.
One can emulate pooling by a factor of 2 in spherical CNNs by decreasing the signal bandwidth by a factor of 2 and increasing max_beta by a factor of 2.
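The pooling analogy above can be sketched as a per-layer schedule. This is a minimal illustration, not part of the library: the helper name `pooling_schedule`, the starting values, and the cap of max_beta at π are assumptions made for the example. It also assumes the signal is sampled on a 2b × 2b grid for bandwidth b, so that one grid step in β is π / (2b).

```python
import math

def pooling_schedule(bandwidth, max_beta, n_layers):
    """Hypothetical helper: halve the bandwidth and double max_beta per layer."""
    schedule = []
    for _ in range(n_layers):
        schedule.append((bandwidth, max_beta))
        bandwidth = max(1, bandwidth // 2)      # emulate pooling by a factor of 2
        max_beta = min(math.pi, 2 * max_beta)   # beta cannot exceed pi (assumed cap)
    return schedule

def extent_in_samples(bandwidth, max_beta):
    """Kernel radius in grid steps, assuming a 2b x 2b grid (beta step = pi / 2b)."""
    return max_beta * 2 * bandwidth / math.pi

# Illustrative starting values (assumptions, not recommended settings).
schedule = pooling_schedule(bandwidth=32, max_beta=math.pi / 16, n_layers=3)
for b, mb in schedule:
    print(f"bandwidth={b:3d}  max_beta={mb:.4f} rad  "
          f"radius={extent_in_samples(b, mb):.1f} samples")
```

Under these assumptions the kernel radius measured in grid samples stays constant across layers while its angular extent doubles, which mirrors a fixed-size kernel applied after pooling in a planar CNN.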
n_beta:
Number of rings of the kernel around the north pole, equally spaced in β up to max_beta.
The choice n_beta=1 corresponds to a small 3x3 kernel in conv2d since in both cases the resulting kernel consists of one central pixel and one ring around the center.
n_alpha:
Gives the number of learned parameters on each ring around the pole.
By default these values are equally spaced along the azimuth.
A sensible number of values depends on the bandwidth and max_beta, since a higher resolution or a larger spatial extent allows finer kernels to be sampled without aliasing.
In practice this value is typically set to a constant, low value like 6 or 8.
A reduced bandwidth of the signal is thereby counteracted by an increased
max_beta to emulate spatial pooling.
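To make the roles of max_beta, n_beta and n_alpha concrete, the following sketch enumerates the (β, α) sample points of a near-identity kernel as described above. It is an illustration under stated assumptions, not the library's implementation: the function name `near_identity_rings` and the default values are hypothetical, and only the ring points are enumerated.

```python
import math

def near_identity_rings(max_beta=math.pi / 8, n_alpha=8, n_beta=3):
    """Sketch: (beta, alpha) points on n_beta rings around the north pole."""
    # Rings equally spaced in beta; the outermost ring sits at max_beta.
    betas = [max_beta * (i + 1) / n_beta for i in range(n_beta)]
    # n_alpha points per ring, equally spaced along the azimuth in [0, 2*pi).
    alphas = [2 * math.pi * j / n_alpha for j in range(n_alpha)]
    return [(beta, alpha) for beta in betas for alpha in alphas]

# One ring with six azimuthal samples.
grid = near_identity_rings(max_beta=math.pi / 8, n_alpha=6, n_beta=1)
print(len(grid))  # number of sample points = n_alpha * n_beta
for beta, alpha in grid:
    print(f"beta={beta:.4f}  alpha={alpha:.4f}")
```

The number of sample points grows as n_alpha * n_beta, which is why a low, constant n_alpha keeps the parameter count modest even when max_beta grows in later layers.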
so3_near_identity_grid has two additional parameters, max_gamma and n_gamma.
SO(3) can be seen as a (principal) fiber bundle SO(3)→S² with the sphere S² as base space and fiber SO(2) attached to each point.
The additional parameters control the grid on the fiber in the following way:
max_gamma:
The kernel spans the fiber SO(2) over γ∈[0, max_gamma].
The fiber SO(2) encodes the kernel responses for every sampled orientation at a given position on the sphere.
max_gamma≨2π results in the kernel not seeing the responses of all kernel orientations simultaneously and is generally disfavored.
Steerable CNNs usually use max_gamma=2π.
n_gamma:
Number of learned parameters on the fiber.
Typically set equal to n_alpha, i.e. to a low value like 6 or 8.
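The fiber parameters can be illustrated by extending the ring picture with a third angle γ. The sketch below is a simplified illustration, not the library's code: the function name `near_identity_so3_points` and the defaults are hypothetical, γ is sampled over [0, max_gamma] as described in the text, and the library's own sampling convention on the fiber may differ.

```python
import math

def near_identity_so3_points(max_beta=math.pi / 8, max_gamma=2 * math.pi,
                             n_alpha=8, n_beta=3, n_gamma=None):
    """Sketch: (beta, alpha, gamma) kernel sample points near the identity."""
    if n_gamma is None:
        n_gamma = n_alpha  # "typically set equal to n_alpha"
    betas = [max_beta * (i + 1) / n_beta for i in range(n_beta)]
    alphas = [2 * math.pi * j / n_alpha for j in range(n_alpha)]
    if math.isclose(max_gamma, 2 * math.pi):
        # Full fiber: drop the endpoint so gamma=0 and gamma=2*pi are not both sampled.
        step = max_gamma / n_gamma
    else:
        # Partial fiber: include both ends of [0, max_gamma].
        step = max_gamma / max(n_gamma - 1, 1)
    gammas = [step * k for k in range(n_gamma)]
    return [(b, a, g) for b in betas for a in alphas for g in gammas]

points = near_identity_so3_points(n_alpha=6, n_beta=1)
print(len(points))  # number of sample points = n_beta * n_alpha * n_gamma
```

With n_gamma tied to n_alpha, the number of sample points scales as n_beta * n_alpha², which is one reason both values are kept low in practice.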
See the deep model of the MNIST example for an example of how to adapt these parameters over layers.
For questions and comments, feel free to contact us: geiger.mario (gmail), taco.cohen (gmail), jonas (argmin.xyz).