Can someone explain what a splat is? I did graphics programming 25 years ago, but haven't touched it since. I don't think I've ever heard this word before.
A splat is basically a point in a point cloud, except a gaussian splat isn't infinitesimally small: it's a 3D gaussian, with the point as its mean. It also has color and opacity. You can also stretch it into an ellipsoid instead of having it keep perfect radial symmetry.
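As a rough sketch of that description (all names here are illustrative, not from any real splatting library; the covariance is kept axis-aligned for simplicity):

```python
import numpy as np

class GaussianSplat:
    """Toy splat: a 3D gaussian with color and opacity."""
    def __init__(self, mean, scales, color, opacity):
        self.mean = np.asarray(mean, dtype=float)    # center (the "point")
        self.cov = np.diag(np.asarray(scales) ** 2)  # ellipsoid axes; a real splat also rotates this
        self.color = color                            # RGB
        self.opacity = opacity

    def density(self, x):
        """Unnormalized gaussian falloff at point x (1.0 at the mean)."""
        d = np.asarray(x, dtype=float) - self.mean
        return float(np.exp(-0.5 * d @ np.linalg.inv(self.cov) @ d))

splat = GaussianSplat(mean=[0, 0, 0], scales=[1.0, 2.0, 0.5],
                      color=(255, 0, 0), opacity=0.8)
print(splat.density([0, 0, 0]))  # 1.0 at the mean, falling off away from it
```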
And in case it helps further in the context of the article: traditional rendering pipelines for games don't render fuzzy Gaussian points, but triangles instead.
Having the model trained on how to construct triangles (rather than blobbly points) means that we're closer to a "take photos of a scene, process them automatically, and walk around them in a game engine" style pipeline.
Any insights into why game engines prefer triangles rather than gaussians for fast rendering?
Are triangles cheaper for the rasterizer, antialiasing, or something similar?
Cheaper for everything, ultimately.
A triangle by definition is guaranteed to be co-planar; three vertices must describe a single flat plane. This means every triangle has a single normal vector across it, which is useful for calculating angles to lighting or the camera.
It's also very easy to interpolate points on the surface of a triangle, which is good for texture mapping (and many other things).
It's also easy to work out if a line or volume intersects a triangle or not.
Because they're the simplest possible representation of a surface in 3D, the individual calculations per triangle are small (and more parallelisable as a result).
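The first two points above can be sketched in a few lines (a toy illustration, not production rasterizer code): one cross product gives the single normal, and barycentric weights interpolate any per-vertex attribute across the surface.

```python
import numpy as np

def triangle_normal(a, b, c):
    """One cross product: the whole triangle shares this normal."""
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

def barycentric_interp(p, a, b, c, va, vb, vc):
    """Interpolate per-vertex values (e.g. texture coords) at point p.

    Solves for weights (u, v, w) with p = u*a + v*b + w*c and u+v+w = 1.
    """
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    u = 1.0 - v - w
    return u * va + v * vb + w * vc

a, b, c = (np.array(p, dtype=float) for p in ([0, 0, 0], [1, 0, 0], [0, 1, 0]))
print(triangle_normal(a, b, c))            # [0. 0. 1.]
centroid = (a + b + c) / 3
print(barycentric_interp(centroid, a, b, c, 0.0, 3.0, 6.0))  # 3.0
```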
Triangles are the simplest polygons, and simple is good for speed and correctness.
Older GPUs natively supported quadrilaterals (four sided polygons), but these have fundamental problems because they're typically specified using the vertices at the four corners... but these may not be co-planar! Similarly, interpolating texture coordinates smoothly across a quad is more complicated than with triangles.
Similarly, older GPUs had good support for "double-sided" polygons where both sides were rendered. It turned out that 99% of the time you only want one side, because you can only see the outside of a solid object. Rendering the inside back-face is a pointless waste of computing power. Dropping it actually simplified rendering algorithms by removing some conditionals from the mathematics.
Eventually, support for anything but single-sided triangles was in practice emulated with a bunch of triangles anyway, so these days we just stopped pretending and use only triangles.
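The back-face culling described above boils down to one dot product per triangle (a toy sketch; the winding-order convention here is an assumption, and `view_dir` points from the camera toward the triangle):

```python
import numpy as np

def is_back_facing(a, b, c, view_dir):
    """True when the triangle's front side faces away from the camera."""
    n = np.cross(b - a, c - a)   # counter-clockwise winding defines the front
    return bool(n @ view_dir > 0)  # normal and view agree: we see the back

a, b, c = (np.array(p, dtype=float) for p in ([0, 0, 0], [1, 0, 0], [0, 1, 0]))
print(is_back_facing(a, b, c, np.array([0.0, 0.0, -1.0])))  # False: camera sees the front
print(is_back_facing(a, b, c, np.array([0.0, 0.0,  1.0])))  # True: cull it
```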
As an aside, a few early 90s games did experiment with spheroid sprites to approximate 3D rendering, including the DOS game Ecstatica [1] and the (unfortunately named) SNES/Genesis game Ballz 3D [2]
[1] https://www.youtube.com/watch?v=nVNxnlgYOyk
[2] https://www.youtube.com/watch?v=JfhiGHM0AoE
>triangles cheaper for the rasterizer
Yes, using triangles simplifies a lot of math, and GPUs were created to be really good at doing the math related to triangle rasterization (affine transformations).
Yes cheaper. Quads are subject to becoming non-planar leading to shading artifacts.
In fact, I believe that under the hood all 3D models are triangulated.
Yes. Triangles are cheap. Ridiculously cheap. For everything.
So instead of a point, a splat is more of a (colored) cloud itself.
So a gaussian splat scene is not a pointcloud but rather a cloudcloud.
> so a gaussian splat scene is not a pointcloud but rather a cloudcloud.
A good way of putting it.
I always assumed "gaussian splatting" was a reference to old school texture splatting, where textures are alpha-blended together. AFAIK the graphics terminology of splats as objects (in addition to splatting as an operation) is new.
Practically, what differentiates a splat from standard photogrammetry is that it can capture things like reflections, transparency and skies. A standard photogrammetry reconstruction of (for example) a mirror would confuse the reflection in the mirror for a space behind the mirror. A reconstruction of a sheet of glass would likewise suffer.
The problem is that any tool or process that converts splats into regular geometry produces plain old geometry and RGB textures, thus losing that advantage. For this reason splats are (in my opinion) a tool in search of an application. Doubtless some here will disagree.
I've never been quite clear on how Splats encode specular (directional) effects. Are they made to only be visible from a narrow field of view (so you see a different splat for different view angles?) or do they encode the specular stuff internally somehow?
This is a good question. As I understand it, the only material parameters a splat can recognize are color and transparency. Therefore the first of your two options would be the correct one.
You can use spherical harmonics to encode a few coefficients in addition to the base RGB for each splat such that the rendertime view direction can be used to compute an output RGB. A "reflection" in 3DGS isn't a light ray being traced off the surface, but instead a way of saying "when viewed from this angle, the splat may take an object's base color, while from that angle, the splat may be white because the input image had glare"
This ends up being very effective with interpolation between known viewpoints, and hit-or-miss extrapolation beyond known viewpoints.
Because you have source imagery and colors (and therefore specular and reflective details) from different angles you can add a view angle and location based component to the material/color function; so the material is not just f(point in 3d space) it’s f(pt, view loc, view direction). That’s made differentiable and so you get viewpoint dependent colors for ‘free’.
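A hedged sketch of the spherical-harmonics idea from the replies above (function and variable names are mine; the basis constants and the +0.5 offset follow the published 3DGS convention, here truncated to degree 1):

```python
import numpy as np

SH_C0 = 0.28209479177387814  # constant factor of the Y_0^0 basis function
SH_C1 = 0.4886025119029199   # constant factor of the Y_1^m basis functions

def sh_color(base, coeffs, view_dir):
    """Evaluate view-dependent RGB from degree-0 and degree-1 SH coefficients.

    base:   RGB degree-0 coefficients
    coeffs: 3 RGB degree-1 coefficients (one row per Y_1^m term)
    """
    d = view_dir / np.linalg.norm(view_dir)
    rgb = SH_C0 * base
    # degree-1 terms, in the sign/order used by the 3DGS reference renderer
    rgb = rgb - SH_C1 * d[1] * coeffs[0] \
              + SH_C1 * d[2] * coeffs[1] \
              - SH_C1 * d[0] * coeffs[2]
    return np.clip(rgb + 0.5, 0.0, 1.0)

base = np.array([1.0, 0.2, 0.2])  # mostly-red base color
coeffs = np.zeros((3, 3))
coeffs[2] = 0.3                   # make color depend on the x view direction
print(sh_color(base, coeffs, np.array([ 1.0, 0.0, 0.0])))  # darker from +x
print(sh_color(base, coeffs, np.array([-1.0, 0.0, 0.0])))  # brighter from -x
```

So a "reflection" or glare highlight is just a different RGB popping out for some view directions, exactly as described above.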
The prior paper by the authors spends more time explaining what's happening; I would start there:
https://convexsplatting.github.io/
the seminal paper is still this one:
https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/
To add to the rest of the replies: color comes from spherical harmonics, which I'm sure you've come across (traditionally used for diffuse lighting or shadows; SuperTuxKart uses them).
It's basically a blob in space. When you have millions of them, you can use gradient descent to minimize the loss between their rendering and the source images.
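A toy sketch of that training-loop idea, shrunk to one 1D gaussian and one "source image" so the gradient can be written by hand (everything here is illustrative; real 3DGS optimizes millions of splats with autodiff through a differentiable rasterizer):

```python
import numpy as np

xs = np.linspace(-3, 3, 200)
target = np.exp(-0.5 * (xs - 1.2) ** 2)  # "ground-truth image": a blob at x=1.2

mu = 0.0   # the splat parameter we optimize (its mean)
lr = 0.5
for _ in range(200):
    render = np.exp(-0.5 * (xs - mu) ** 2)   # "render" the splat
    residual = render - target               # L2 loss is mean(residual**2)
    # analytic gradient of the loss w.r.t. mu
    grad = np.mean(2.0 * residual * render * (xs - mu))
    mu -= lr * grad                          # gradient-descent step

print(round(mu, 3))  # converges toward 1.2, matching the target blob
```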