A Comprehensive Overview of Gaussian Splatting

Author: Murphy  |  Time: 2025-03-22 23:37:17

Everything you need to know about the new trend in the field of 3D representations

Gaussian splatting is a method for representing 3D scenes and rendering novel views, introduced in "3D Gaussian Splatting for Real-Time Radiance Field Rendering"¹. It can be thought of as an alternative to NeRF²-like models, and just like NeRF back in the day, it has inspired a wave of follow-up research that uses it as the underlying representation of a 3D world for various use cases. So what's so special about it, and why is it better than NeRF? Or is it? Let's find out!

TL;DR

First and foremost, the main claim to fame of this work is its high rendering speed, as the title suggests. The speed comes from two things: the representation itself, which is covered below, and a tailored implementation of the rendering algorithm with custom CUDA kernels.

Figure 1: A side-by-side comparison of previous high-quality representations and Gaussian Splatting (marked as "Ours") in terms of rendering speed (fps), training time (min), and visual quality (Peak signal-to-noise ratio, the higher the better) [Source: taken from [1]]
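Since Figure 1 reports visual quality as PSNR, here is a minimal sketch of how that metric is usually computed from a rendered image and its ground-truth reference. The function name `psnr`, the flat list of pixel values, and the `max_value` parameter are assumptions made for illustration; they are not taken from the paper's code.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumes pixel intensities lie in [0, max_value]. Higher is better.
    """
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)

    # Identical images have zero error, so the ratio is unbounded.
    if mse == 0:
        return math.inf

    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical two-pixel "images", only to show the call.
score = psnr([0.0, 1.0], [0.1, 0.9])
```

Because the scale is logarithmic, a gain of a few dB corresponds to a substantial reduction in per-pixel error.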

Additionally, Gaussian splatting doesn't involve any neural network at all. There isn't even a small MLP, nothing "neural": a scene is essentially just a set of points in space. This in itself is already an attention grabber. It is quite refreshing to see such a method gaining popularity in our AI-obsessed world, with research companies chasing models made up of more and more billions of parameters. Its idea stems from "Surface splatting"³ (2001), so it sets a cool example that classic computer vision approaches can still inspire relevant solutions. Its simple and explicit representation makes Gaussian splatting particularly interpretable, which is a very good reason to choose it over NeRFs for some applications.

Representing a 3D world

As mentioned earlier, in Gaussian splatting a 3D world is represented with a set of 3D points, typically somewhere between 0.5 and 5 million of them. Each point is a 3D Gaussian with its own parameters, which are fitted per scene so that renders of the scene closely match the known dataset images. The optimization and rendering processes will be discussed later, so let's focus for a moment on the necessary parameters.
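To make "a scene is a set of 3D Gaussians" concrete, here is a minimal sketch that stores only what any 3D Gaussian has by definition, a mean and a covariance, and evaluates its unnormalized falloff at a query point. The names `Gaussian3D`, `inverse_3x3`, and `falloff` are hypothetical, the two-element scene is a toy example, and the remaining per-Gaussian attributes used for rendering are deliberately left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian3D:
    mean: tuple  # (x, y, z) center of the Gaussian
    cov: tuple   # 3x3 symmetric positive-definite covariance, as nested tuples


def inverse_3x3(m):
    """Invert a 3x3 matrix with the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    return (
        ((e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det),
        ((f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det),
        ((d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det),
    )


def falloff(gaussian, point):
    """Unnormalized density exp(-0.5 * d^T cov^-1 d), with d = point - mean."""
    d = [p - m for p, m in zip(point, gaussian.mean)]
    inv = inverse_3x3(gaussian.cov)
    # Squared Mahalanobis distance of the point from the Gaussian's center.
    dist2 = sum(d[r] * inv[r][c] * d[c] for r in range(3) for c in range(3))
    return math.exp(-0.5 * dist2)


# A toy "scene" of two Gaussians; a real scene would hold millions.
scene = [
    Gaussian3D(mean=(0.0, 0.0, 0.0),
               cov=((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))),
    Gaussian3D(mean=(2.0, 0.0, 0.0),
               cov=((0.25, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 4.0))),
]

at_center = falloff(scene[0], (0.0, 0.0, 0.0))
off_center = falloff(scene[1], (2.5, 0.0, 0.0))
```

The falloff is 1 at the center and decays with distance, more quickly along axes where the covariance is small. This is what lets each point cover an anisotropic blob of space rather than a single location.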

Figure 2: Centers of the Gaussians (means) [Source: taken from Dynamic 3D Gaussians⁴]

Each 3D Gaussian is parametrized by:
