A gentle introduction to Steerable Neural Networks (part 1)

Author: Murphy  |  2025-03-23

Introduction

Geometric Deep Learning, as a branch of Deep Learning, aims to extend traditional AI frameworks such as Convolutional Neural Networks to process 2D or 3D geometric objects represented as graphs, manifolds, or point clouds. By incorporating geometric relationships and spatial dependencies directly into the learning framework, Geometric Deep Learning exploits the inherent structural properties of the data and removes the need for memory-intensive data augmentation techniques. For all these reasons, Geometric Deep Learning can be seen as a valuable tool for tackling complex data scenarios in domains such as computer vision, natural language processing, and beyond. Depending on the type of task and the type of transformation, a large variety of new CNN architectures have been proposed, such as "Spherical Neural Networks" (link), "Graph Neural Networks" ([link](https://arxiv.org/pdf/1812.08434.pdf)), and "Steerable Neural Networks".

Steerable Neural Networks have garnered significant interest due to their unique ability to extend the capabilities of regular Convolutional Neural Networks (CNNs). These networks can be viewed as an evolution of CNNs, where the kernel is conditioned to satisfy specific constraints. While CNNs excel at being equivariant to translation, Steerable Neural Networks take it a step further by offering enhanced flexibility and capturing a wider range of transformations, such as rotation.

This tutorial will present an introduction to "Steerable Neural Networks" (S-CNNs), trying to convey an intuitive understanding of the mathematical concepts behind them and a step-by-step explanation of how to design these networks. The tutorial is composed of two articles. This first article serves as an introduction to steerable neural networks (NNs), explaining their purpose and delving deeper into the concepts and formalism underlying S-CNNs. The second article (here) discusses at a high level the design of steerable filters and of steerable networks as a whole.

This work aims to fill the gap between the current scientific literature and the wider data science audience. It is ideal for tech professionals as well as for researchers in this new branch of machine learning.

Example of a simple Steerable Neural Network taken from the paper [3]. As can be seen, a rotation of the input image is reflected in the output response of the network.

The following papers are taken as reference:

[1] "3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data", Weiler et al. (link);

[2] "Steerable CNNs", Cohen et al. (link);

[3] "Learning Steerable Filters for Rotation Equivariant CNNs", Weiler et al. (link);

[4] "General E(2)-Equivariant Steerable CNNs", Weiler et al. (link);

[5] "Scale Steerable Filters for the Locally Scale-Invariant Convolutional Neural Network", Ghosh et al. (link);

[6] "A Program to Build E(n)-Equivariant Steerable CNNs", Cesa et al. (link).

What are Steerable Neural Networks:

Steerable neural networks take their name from the particular type of filters they use. These filters are called g-steerable filters, and they were inspired by the steerable filters that gained popularity in the image recognition field at the beginning of the 1990s for tasks such as edge detection and oriented texture analysis. "Steerable" commonly means dirigible or manageable: capable of being managed or controlled. Following this convention, the response of a steerable filter is orientable and adaptable to a specific orientation of the input (an image, for example).

Steerability is related to another very important property called equivariance. In an equivariant filter, if the INPUT to the filter is transformed according to a precise and well-defined geometric transformation g (a translation, rotation, or shift), the OUTPUT (which results from the convolution of the INPUT with the filter) is transformed by the same transformation g. In general, equivariance does not require that the two transformations (the one at the input and the one at the output) are identical. This concept will be addressed more precisely in the next section, but for now it allows us to provide a first definition of steerable filters and steerable CNNs.
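To make the notion of equivariance concrete before the formal definitions, here is a minimal NumPy/SciPy sketch (an illustration of the general idea, not code from the referenced papers). It shows that a plain convolution is equivariant to translation, but not, in general, to rotation when the kernel is not rotation-symmetric:

```python
import numpy as np
from scipy.ndimage import convolve

# A toy image and a small, asymmetric edge-detection kernel.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # horizontal gradient filter

# Transformation g: translate by 2 pixels along each axis (circularly,
# matching the "wrap" boundary mode of the convolution).
def g(x):
    return np.roll(x, shift=(2, 2), axis=(0, 1))

# Equivariance to translation: transforming the input and then convolving
# gives the same result as convolving and then transforming the output.
shift_then_conv = convolve(g(image), kernel, mode="wrap")
conv_then_shift = g(convolve(image, kernel, mode="wrap"))
assert np.allclose(shift_then_conv, conv_then_shift)

# The same does NOT hold for a 90-degree rotation with this kernel:
rot_then_conv = convolve(np.rot90(image), kernel, mode="wrap")
conv_then_rot = np.rot90(convolve(image, kernel, mode="wrap"))
assert not np.allclose(rot_then_conv, conv_then_rot)
```

The second assertion is exactly the gap that steerable filters are designed to close: they constrain the kernel so that rotations of the input also commute with the convolution.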

A Steerable CNN filter can be defined as a filter whose kernel is structured as a concatenation of different steerable filters. These filters show equivariance properties in relation to the operation of convolution with respect to a set of well-defined geometric transformations.

As we will see later, the condition of equivariance on the convolution operation leads to specific constraints on the structure of the kernel and on its weights. From this definition it is now possible to define what a steerable CNN is: a Steerable Neural Network is a neural network composed of a series of steerable filters.
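As a toy illustration of what "steerable" means for a single filter, the following sketch uses the classic steerable filters of the early 1990s that inspired g-steerable kernels (the construction of Freeman and Adelson, not the learned kernels of the papers above). The first derivative of a 2D Gaussian at any orientation is exactly a linear combination of just two fixed basis filters, so the filter can be "steered" to any angle without resampling:

```python
import numpy as np

# Coordinate grid for a 9x9 filter support.
coords = np.linspace(-2.0, 2.0, 9)
x, y = np.meshgrid(coords, coords)
gauss = np.exp(-(x**2 + y**2))

# Basis filters: derivatives of the Gaussian along x and along y.
g_x = -2.0 * x * gauss
g_y = -2.0 * y * gauss

def steer(theta):
    """Directional-derivative filter at angle theta, built as a linear
    combination of the two basis filters."""
    return np.cos(theta) * g_x + np.sin(theta) * g_y

# The steered combination matches the filter computed analytically at the
# rotated orientation (directional derivative along (cos t, sin t)).
theta = np.deg2rad(30)
direct = -2.0 * (np.cos(theta) * x + np.sin(theta) * y) * gauss
assert np.allclose(steer(theta), direct)
```

Steerable CNNs generalize this idea: the learnable kernels are expanded in a basis chosen so that the whole layer responds predictably to a group of transformations, not just to in-plane rotations of one handcrafted filter.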

What are S-CNNs used for:

The strength of a normal CNN lies in its equivariance to translation. Steerable NNs, however, are more flexible and can be made equivariant to other types of transformations, such as rotation. In a problem with rotational symmetry, an unmodified CNN is compelled to learn rotated versions of the same filter, introducing redundant degrees of freedom and increasing the risk of overfitting. For this reason, steerable CNNs can outperform classical CNNs by directly incorporating information about the geometric transformations acting on the input. This property makes S-CNNs particularly useful for several challenging tasks where we have to process inputs that have a geometrical description and representation, such as images, manifolds, or vector fields.

Possible practical applications include, for example:

  • Challenging 2D image segmentation: predicting the cell boundaries given an input microscope image.
  • 3D model classification: classifying and recognizing 3D objects.
  • 3D chemical structure prediction: predicting the 3D structure of a molecule given its chemical composition. A possible example is the prediction of the spatial preferences of a group of amino acids given its sequence, as explained in section 5.4 of the paper [2].

Example of a 3D steerable neural network applied to 3D object recognition: the input object (top) and the feature maps of two different hidden layers. Taken from Link

Preliminary definitions and Context

After introducing Steerable Neural Networks and their applications, let's dive into the theory behind them. This section offers a more formal explanation of equivariance and steerability, providing essential definitions and a formal framework that will be instrumental in understanding the construction of steerable filters in the subsequent article. This article relies on an understanding of maps and geometric transformations; for more information, take a look at this other article.

1. EQUIVARIANCE:

Equivariance is a property of particular interest in problems with symmetry. As stated before, in an equivariant model, when a transformation acts on the input, a corresponding transformation acts on the output, so that the transformation can be applied before or after the model with no change in overall behaviour. There are many everyday examples of equivariance: when driving, the direction in which a car steers when the wheel is turned is equivariant with respect to the direction the car is pointing. Formally, if we have a map
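The condition the sentence above leads into can be written as follows (the notation here is a standard choice from the equivariant-networks literature, not a verbatim quote from the referenced papers: $\rho_{\mathrm{in}}$ and $\rho_{\mathrm{out}}$ denote the representations of a group $G$ acting on the input and output spaces of a map $f$):

```latex
f\left(\rho_{\mathrm{in}}(g)\, x\right) = \rho_{\mathrm{out}}(g)\, f(x), \qquad \forall g \in G,\ \forall x .
```

When $G$ is the group of translations and $\rho_{\mathrm{in}} = \rho_{\mathrm{out}}$, this reduces to the familiar translation equivariance of ordinary CNNs; invariance is the special case in which $\rho_{\mathrm{out}}(g)$ is the identity for every $g$.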

