This is the first in a series of posts that will follow our past, present, and future work in the area of ray tracing. We'll outline each of the techniques we have developed over the last 3 years and explain how each has helped us reach our end goal of real-time ray-traced content that is accessible on any device anywhere in the world. But before we delve into that, we'll start with a quick introduction to what we mean by ray tracing.
What is ray tracing?
Ray tracing is a technique used in computer graphics to realistically recreate how light interacts with objects and materials in the real world. The colour of each pixel is calculated by casting rays from the surface visible at that pixel into the rest of the scene to determine how much light reaches it.
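To make the idea concrete, here is a deliberately tiny, self-contained sketch - purely illustrative, nothing like a production renderer. A primary ray is cast from the camera through each pixel; if it hits a surface, a ray towards the light estimates how much light reaches that point. It shades a single sphere and prints it as ASCII art.

```cpp
#include <cmath>
#include <cstdio>

struct Vec { float x, y, z; };
static Vec sub(Vec a, Vec b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec a, Vec b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec norm(Vec a) { float l = std::sqrt(dot(a, a)); return {a.x / l, a.y / l, a.z / l}; }

// Ray/sphere intersection: distance along the ray, or -1 on a miss.
static float hitSphere(Vec origin, Vec dir, Vec centre, float radius) {
    Vec oc = sub(origin, centre);
    float b = dot(oc, dir);
    float c = dot(oc, oc) - radius * radius;
    float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;
    float t = -b - std::sqrt(disc);
    return t > 0.0f ? t : -1.0f;
}

int main() {
    const int W = 48, H = 24;
    Vec sphereCentre = {0.0f, 0.0f, 3.0f};
    Vec light = {2.0f, 2.0f, 0.0f};
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            // Primary ray: from the camera at the origin through this pixel.
            Vec dir = norm({(x - W / 2) / float(H), (H / 2 - y) / float(H), 1.0f});
            float t = hitSphere({0.0f, 0.0f, 0.0f}, dir, sphereCentre, 1.0f);
            if (t < 0.0f) { std::putchar(' '); continue; }   // missed everything: background
            // Ray from the hit point towards the light estimates how much light reaches it.
            Vec p = {dir.x * t, dir.y * t, dir.z * t};
            Vec n = norm(sub(p, sphereCentre));
            float brightness = dot(n, norm(sub(light, p)));
            std::putchar(brightness > 0.5f ? '#' : brightness > 0.0f ? '+' : '.');
        }
        std::putchar('\n');
    }
}
```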
Ray tracing has been used for many years in visual effects. Primarily, its purpose has been to help produce realistic special effects for movies or to enhance renders of products that are not yet available to photograph. However, this process is painstakingly slow; in order to create a realistic image, many rays must be traced. It typically takes hours - if not days - to create just a single frame of content this way.
This is where real-time rendering comes in. This rendering technique was pioneered by the gaming industry, and it basically means that each frame of content shown to the user is rendered there and then without delay. This gives content creators far more flexibility and agility, and it means the content can be fully interactive and responsive.
To make real time work, the software must be able to render a new frame within a strict time budget; this is typically 33ms (for 30fps) or 16ms (for 60fps). Pixels are drawn by rasterising polygons, using a customisable shader to achieve the desired look. The ability of a GPU to process thousands of pixels in parallel gives real-time rasterisation its speed, but it also restricts the quality of the image because each pixel has only limited knowledge of the rest of the scene.
As computers have become more powerful, the techniques used to simulate reality have become more complex, which means offline-rendered content is often indistinguishable from reality. Similarly, real-time rendering has entered a new era as GPUs have become powerful enough to run ray-traced solutions in real time.
Sparse Voxel Reflections
At ZeroLight, our first project that involved ray tracing was back in 2017, when Audi asked us to take the showroom configurator we had built for them to the next level. They delivered a selection of offline-rendered ray-traced images and asked us to find a way to match their quality and detail in real time.
We implemented many new techniques to get the materials and lighting to match the quality of these images while ensuring that our simulation could still maintain a solid 60 frames per second at a resolution of 4K. One of the most complex material properties to recreate was the reflection of objects within their own surfaces. The most obvious instances of this on a vehicle are the wing mirrors and the interior glass.
To address this, we used screen-space reflections (SSR), a technique used in many modern games. It's typically implemented as a post effect that uses the depth buffer and the final screen image. With this data, you can ray march over the image to find the rendered pixel of an object that would be reflected in a given surface. The major benefit of this technique is that it is relatively inexpensive, as rendered pixels are reused rather than recomputed. However, a significant problem arises when pixels that should be reflected are not visible in the main scene; this results in missing data and pixelated artefacts.
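The sketch below shows the core of that screen-space ray march written as plain C++ rather than shader code, purely for readability; the buffer layout and marching scheme are illustrative assumptions, not the code we ship. The ray is stepped across the screen until it passes behind the depth stored in the buffer; if it leaves the screen first, the reflected point was never rendered, which is exactly the missing-data case described above.

```cpp
#include <cstdio>
#include <vector>

struct ScreenHit { bool found; int x, y; };

// March from a start pixel along a screen-space reflection direction,
// stepping until the ray passes behind the depth stored in the buffer.
ScreenHit marchReflection(const std::vector<float>& depth, int width, int height,
                          float startX, float startY, float startDepth,
                          float dirX, float dirY, float dirDepth, int maxSteps) {
    float x = startX, y = startY, z = startDepth;
    for (int i = 0; i < maxSteps; ++i) {
        x += dirX; y += dirY; z += dirDepth;
        int px = int(x), py = int(y);
        if (px < 0 || py < 0 || px >= width || py >= height)
            return {false, 0, 0};                 // ray left the screen: no data to reuse
        if (depth[py * width + px] < z)
            return {true, px, py};                // ray is behind the surface rendered here: reuse this pixel
    }
    return {false, 0, 0};
}

int main() {
    // Toy depth buffer: a "wall" at depth 5 covering the right half of a 16x16 screen.
    const int W = 16, H = 16;
    std::vector<float> depth(W * H, 100.0f);
    for (int y = 0; y < H; ++y)
        for (int x = W / 2; x < W; ++x)
            depth[y * W + x] = 5.0f;

    ScreenHit hit = marchReflection(depth, W, H, 2, 8, 1.0f, 1.0f, 0.0f, 0.6f, 32);
    if (hit.found) std::printf("reflection found at pixel (%d, %d)\n", hit.x, hit.y);
    else           std::printf("no on-screen data for this reflection\n");
}
```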
We decided that even though this works for games, it would not be acceptable for Audi's premium content. This drove us to develop a new technique that would allow all reflections to be accurate at a high frame rate.
Voxel Reflections
Voxels are basically a 3D version of pixels: a uniform grid of 3D data. The way the data is arranged makes it very efficient for the GPU to quickly locate a reflected voxel when shading each pixel. This speed is essential to maintaining a high frame rate.
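As a rough illustration of why that lookup is cheap, the sketch below (with an assumed grid layout, not our actual format) maps a world-space position straight to a flat buffer index with a handful of multiplies, then steps a reflection ray through the grid until it finds an occupied voxel.

```cpp
#include <array>
#include <cstdio>

constexpr int N = 64;                  // voxels per axis
constexpr float WORLD_SIZE = 8.0f;     // metres covered by the grid

struct Voxel { bool occupied; float r, g, b; };
std::array<Voxel, N * N * N> grid{};   // flat buffer, zero-initialised (all empty)

// Convert a world-space position to a flat buffer index, or -1 if outside the grid.
int voxelIndex(float x, float y, float z) {
    int ix = int(x / WORLD_SIZE * N);
    int iy = int(y / WORLD_SIZE * N);
    int iz = int(z / WORLD_SIZE * N);
    if (ix < 0 || iy < 0 || iz < 0 || ix >= N || iy >= N || iz >= N) return -1;
    return (iz * N + iy) * N + ix;
}

int main() {
    // Mark one voxel as containing pre-lit surface data.
    grid[voxelIndex(4.0f, 2.0f, 6.0f)] = {true, 0.8f, 0.1f, 0.1f};

    // Step a reflection ray from (4, 2, 0) along +z, sampling the grid at each step.
    float x = 4.0f, y = 2.0f, z = 0.0f;
    const float step = WORLD_SIZE / N;
    while (true) {
        int idx = voxelIndex(x, y, z);
        if (idx < 0) { std::printf("ray left the grid\n"); break; }
        if (grid[idx].occupied) {
            std::printf("reflected colour: %.1f %.1f %.1f\n", grid[idx].r, grid[idx].g, grid[idx].b);
            break;
        }
        z += step;
    }
}
```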
The voxels are generated when a new vehicle configuration or environment is loaded. This baking process takes a couple of seconds, using a geometry shader to transform and write the rasterised pixel-shader lighting values into the voxel buffer. The voxel data is added to an octree (a tree data structure in which each internal node has eight children), which makes the reflection render pass very efficient because all complex lighting data is pre-calculated for each voxel. The efficiency of the lookup into this data enables the real-time ray-tracing experience to run at resolutions up to 4K.
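The sketch below shows the kind of octree descent involved, with an assumed node layout rather than the format used in our renderer: each step picks one of eight octants, so reaching the stored voxel for a point costs only one step per tree level, and the leaf already holds its pre-baked lighting.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct OctreeNode {
    int32_t child[8];   // index of each child node, or -1 if absent
    float   rgb[3];     // pre-baked lighting stored at the deepest level
};

// Descend from the root to the deepest stored node containing point (x, y, z),
// where the octree spans the cube [0, size)^3.
const OctreeNode* lookup(const std::vector<OctreeNode>& nodes,
                         float x, float y, float z, float size) {
    const OctreeNode* node = &nodes[0];                 // root
    float cx = size * 0.5f, cy = size * 0.5f, cz = size * 0.5f;
    float half = size * 0.25f;
    while (true) {
        // Pick the octant the point falls into: one bit per axis.
        int octant = (x >= cx ? 1 : 0) | (y >= cy ? 2 : 0) | (z >= cz ? 4 : 0);
        int32_t next = node->child[octant];
        if (next < 0) return node;                      // reached the deepest stored level
        node = &nodes[next];
        cx += (x >= cx ? half : -half);
        cy += (y >= cy ? half : -half);
        cz += (z >= cz ? half : -half);
        half *= 0.5f;
    }
}

int main() {
    // Two-level tree: a root plus one populated child in octant 0.
    std::vector<OctreeNode> nodes(2);
    for (int i = 0; i < 8; ++i) { nodes[0].child[i] = -1; nodes[1].child[i] = -1; }
    nodes[0].child[0] = 1;
    nodes[1].rgb[0] = 0.9f; nodes[1].rgb[1] = 0.2f; nodes[1].rgb[2] = 0.2f;

    const OctreeNode* leaf = lookup(nodes, 1.0f, 1.0f, 1.0f, 8.0f);
    std::printf("baked colour: %.1f %.1f %.1f\n", leaf->rgb[0], leaf->rgb[1], leaf->rgb[2]);
}
```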
To achieve detailed reflections, the voxel data could be generated at different levels of detail; each level contains 8x as much data as the previous level.
Level 11
To visualise the fine details of a car's dashboard in the reflections in the windows and chrome features, we had to use a very high density of voxels. At octree level 11 (11 layers within the tree structure), we had over 8.5 billion points of data. Even with the data compressed, over 15GB of video memory was required to store it all.
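As a rough sanity check on those numbers, a grid with 11 levels of subdivision has 8^11, roughly 8.6 billion, cells. The snippet below prints the dense cell count per level and an estimated footprint; the 2 bytes per voxel it assumes is purely an illustrative guess.

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    const double bytesPerVoxel = 2.0;          // assumed compressed size, illustrative only
    for (int level = 8; level <= 11; ++level) {
        // Each level doubles the resolution on every axis, so level n has 8^n cells.
        uint64_t voxels = 1ull << (3 * level);
        double gib = voxels * bytesPerVoxel / (1024.0 * 1024.0 * 1024.0);
        std::printf("level %2d: %15llu voxels, ~%6.2f GiB if stored densely\n",
                    level, (unsigned long long)voxels, gib);
    }
}
```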
A lot of this data didn't actually contain any information, as many areas of the volume had no intersection with the car's geometry. Because of this, we managed to reduce the memory overhead to 2GB by changing the octree to store the data in a sparse hierarchy, which meant that these empty areas no longer required any storage. This change did increase the generation cost, but the reduced memory footprint made level-11 quality a feasible solution.
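One common way to lay out such a sparse hierarchy (an illustrative assumption, not necessarily the exact layout we used) is to give each node an 8-bit mask of which octants exist and pack only the existing children, so octants that are empty space cost nothing:

```cpp
#include <cstdint>
#include <cstdio>

// Each node stores a mask of which octants contain geometry and the index of
// its first child; existing children are packed contiguously in one array.
struct SparseNode {
    uint8_t  childMask;
    uint32_t firstChild;
};

// Index of the child covering `octant`, or -1 if that octant is empty space.
int childIndex(const SparseNode& node, int octant) {
    if (!(node.childMask & (1u << octant))) return -1;
    int before = 0;                              // existing children stored ahead of this one
    for (int i = 0; i < octant; ++i)
        if (node.childMask & (1u << i)) ++before;
    return int(node.firstChild) + before;
}

int main() {
    // A node whose octants 1, 4 and 6 contain geometry; its children sit at indices 10-12.
    SparseNode node{0b01010010, 10};
    std::printf("octant 4 -> child %d\n", childIndex(node, 4));   // prints 11
    std::printf("octant 2 -> child %d\n", childIndex(node, 2));   // prints -1: empty space
}
```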
This technique performed outstandingly, producing very accurate results at a low GPU cost per frame. The main issue was the time taken to generate the sparse voxel tree; this overhead meant the technique could not support dynamic objects.
GTC 2018
We first demoed this tech alongside NVIDIA and Audi at GTC 2018. We used dual SLI Volta V100 GPUs to run the simulation at 4K and 60 frames per second. Take a look at the results in the video below!
The next blog in this series will outline how we built on this demo to create GPU Hybrid Ray Tracing.