Sunday, August 7, 2011

JOGL Part 2 - The Deferred Renderer

In this part I will write about the deferred renderer I use and how I configured it. But first things first: what is deferred rendering? It's an alternative to the "standard" forward rendering method. When I first used OpenGL I (and I think many others) used this very simple method:

foreach Object o
 render Object o with all lights
end
 
Simple Forward Rendering

Every object gets drawn with a shader that handles every light source in the scene. A major drawback of this method is that it's very hard to handle large numbers of different lights (point, spot, ..., directional lights). The shader soon ends up with a loop around many "ifs" to pick the correct lighting formula, which is far from optimal. It gets even more complicated when shadows are added to the scene.
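
Just to illustrate the problem, such a do-everything forward shader quickly degenerates into something like the following sketch (this is not code from my engine; the numLights and lightType uniforms are made up):

#define LIGHT_DIRECTIONAL 0
#define LIGHT_POINT 1

varying vec3 normal;
varying vec3 position;

uniform int numLights;
uniform int lightType[8];

void main() {
 vec3 N = normalize(normal);
 vec3 color = vec3(0.0);

 for (int i = 0; i < 8; i++) {
  if (i >= numLights) {
   break;
  }

  vec3 L;
  float attenuation = 1.0;
  if (lightType[i] == LIGHT_DIRECTIONAL) {
   L = normalize(gl_LightSource[i].position.xyz);
  } else if (lightType[i] == LIGHT_POINT) {
   vec3 toLight = gl_LightSource[i].position.xyz - position;
   float d = length(toLight);
   L = toLight / d;
   attenuation = 1.0 / (gl_LightSource[i].constantAttenuation
     + gl_LightSource[i].linearAttenuation * d
     + gl_LightSource[i].quadraticAttenuation * d * d);
  } else {
   // spot lights, shadow lookups, ... : even more branches
   continue;
  }

  color += gl_FrontMaterial.diffuse.rgb * gl_LightSource[i].diffuse.rgb
    * max(dot(N, L), 0.0) * attenuation;
 }

 gl_FragColor = vec4(color, 1.0);
}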

One way to simplify this is to use a multi-pass method. Objects will be drawn multiple times with different shaders and lights. This could look something like this:

foreach Object o
 foreach Light l
  if o is in lightradius of l
   render Object o with Light l
  end
 end
end

Iterating over objects

foreach Light l
 foreach Object o in lightradius of l
  render Object o with Light l
 end
end
  
Iterating over lights

This method simplifies the shader management: you can have different shaders for different light sources. But there are still drawbacks. Both variants use a nested loop over lights and objects, so the amount of work grows with the number of objects times the number of lights, and every added light means drawing all the objects it affects again.

Deferred rendering separates the drawing and the lighting of objects into two passes. First the objects are drawn into an off-screen buffer, then this buffer is lit to produce the final scene. In code terms, it looks something like this:

// Phase 1: fill G-Buffer
foreach Object o
 render Object o into G-Buffer
end

// Phase 2: light Scene
foreach Light l
 render Light l onto Screen
end
 
Deferred Rendering Overview

First Pass: The G-Buffer

The Geometry Buffer (G-Buffer) is the heart of this technique. For every pixel on the screen it stores the information that is later needed to light the scene. The first pass renders the object properties into this buffer; the second pass then lights the scene. This second pass does not light the objects directly but only the contents of the G-Buffer. So what is the G-Buffer and what has to be stored inside it?

The G-Buffer consists of multiple screen-aligned textures that hold every bit of information needed for the second pass, which applies light to every pixel of the newly constructed buffer. If you look at the lighting equations (diffuse, specular) you will see that this includes the normals and the material color, but also the position of the objects. This matters because any information not stored in the G-Buffer is simply not available in phase 2. This is one of the shaders I use to store the object information:

varying vec3 normal;
varying vec3 position;
varying vec2 texCoord;

uniform sampler2D colorMap;

void main() {
 // RT0: eye-space normal, shininess (specular exponent) in the alpha channel
 gl_FragData[0] = vec4(normalize(normal), gl_FrontMaterial.shininess);
 // RT1: albedo (texture * material diffuse); alpha holds the specular intensity (constant 1.0 here)
 gl_FragData[1] = vec4(texture2D(colorMap, texCoord).rgb * gl_FrontMaterial.diffuse.rgb, 1.0);
 // RT2: distance from the camera to the fragment, used later to reconstruct the position
 gl_FragData[2] = vec4(length(position));
}
 
G-Buffer Shader for a textured object

Different shaders are used depending on the object: does it have textures, a special specular texture, does it use bump mapping, ...? You can see that I use three textures to store the needed information; Multiple Render Targets (MRTs) make it possible to draw into all of them in a single pass. The G-Buffer is laid out like this (every application uses a different layout, depending on its requirements):

G-Buffer Layout

I'm not totally satisfied with this layout, as the textures have different sizes and the normals are stored as three 16-bit values. It's possible to compress the normals as seen here. So while this is not an optimal layout, it works. I also had quite some trouble storing the position. The first attempt was to store it as three 32-bit values, which takes almost no time to implement but is simply too big.

Then I threw away the position in the G-Buffer altogether and directly bound the depth buffer as a texture. When the position was needed, I reconstructed it from the depth value by multiplying with the inverse projection matrix and some magic. This worked fine on my desktop computer, but as soon as I ran it on my laptop it gave wrong results. This had something to do with the Shadow Volume technique on ATI (see here). An even bigger problem was that reading a combined depth-stencil texture on my laptop returned corrupt values. As the only information I found about that specific problem was here, I gave up on this approach.

To find a new method I read through all the parts of this. As I had to do the reconstruction without any knowledge of the far plane (another restriction that comes with the Shadow Volume technique), I decided to store the length of the vector from the camera to the pixel. When reconstructing the position in the lighting pass, I create a vector from the camera to the pixel on the near plane and extend it by the length stored in the G-Buffer. The content of the G-Buffer is visualized in the following pictures:

Albedo
Encoded Position
Normals
Specular Intensity/Power
Composed Image
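
To make the reconstruction a bit more concrete, a helper along the lines of the getEyePositionFromDepthMap used in the lighting shader below could look roughly like this. This is only a sketch: it assumes a symmetric frustum and that the near plane is passed in as vec3(halfWidth, halfHeight, distance), which is not necessarily how my actual shader is parameterized:

// Sketch of a possible reconstruction helper. The G-Buffer stores length(eye-space position).
vec3 getEyePositionFromDepthMap(in sampler2D positionMap, in vec2 texCoord, in vec3 nearPlane) {
 float dist = texture2D(positionMap, texCoord).r;

 // point on the near plane that belongs to this pixel (the camera sits at the origin in eye space)
 vec3 onNearPlane = vec3((texCoord * 2.0 - 1.0) * nearPlane.xy, -nearPlane.z);

 // extend the camera-to-pixel ray to the stored length
 return normalize(onNearPlane) * dist;
}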

Second Pass: Lighting

The G-Buffer is now filled with all the information needed to light the scene: every pixel stores the object's position, normal and material properties. The easiest way to apply the lights is to render a full-screen quad for each light with that light's shader enabled. The lighting equation is evaluated for every pixel on the screen, and the results of all lights are blended together.

This is not very elegant, as most lights do not affect every pixel on the screen. To increase performance it is necessary to reduce the amount of shader work spent on pixels that are known to be outside a light's influence region. One way is to render a light volume around the light to determine the lit pixels. This doesn't work with the default light attenuation equations, because the light's intensity never reaches zero, so I use a different model with a fixed light radius. A light volume is then drawn around the light, using the Z-Fail technique commonly used with Shadow Volumes to mark the lit pixels. The following images demonstrate the method:

This way the stencil buffer marks exactly the pixels that should be lit. The next step is to draw a quad with the light's shader to apply light to these pixels; for a point light I project its sphere onto the near plane and draw a quad there. Every type of light source needs its own light volume: a sphere for a point light, a cone for a spot light, a pipe-like volume for "line lights" and no volume at all for directional lights. While fiddling around with different light types I noticed that a neon-lamp-like light would make it much easier to light the scene. Since the lighting is a completely separate step and each light can use its own shader, it was not very hard to build such a light type without falling back on a row of many point lights.
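
To show how little is needed for such a "neon lamp" light: the only real difference to the point light shader below is that the light position is replaced by the closest point on a line segment. A small sketch (lineStart and lineEnd are assumed uniforms in eye space, not taken from my actual shader):

uniform vec3 lineStart; // end points of the "neon lamp" in eye space
uniform vec3 lineEnd;

vec3 closestPointOnLight(in vec3 position) {
 vec3 line = lineEnd - lineStart;
 float t = clamp(dot(position - lineStart, line) / dot(line, line), 0.0, 1.0);
 return lineStart + t * line;
}

The rest of the lighting (attenuation, diffuse, specular) can then treat this point exactly like a point light position.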

To avoid the inefficient use of "glClear(GL_STENCIL_BUFFER_BIT)", it is only called once to initialize the lighting pass. After a light volume has been drawn into the stencil buffer and the lighting has been applied, the volume is simply drawn again to "reset" the stencil buffer to its default value. This turned out to be much faster than clearing after every light. Here you can see a shader I use for a pointlight (some variables omitted):

vec3 calculateColor(in vec3 position, in vec3 N, in vec3 materialColor, in float shininess, in float specularPower) {
 vec3 lightDir = vec3(gl_LightSource[0].position.xyz - position);
 
 float d = length(lightDir);
 float attenuation = calculateAttenuation(lightRadius, d);
 
 vec3 lightColor = vec3(0.0, 0.0, 0.0);
 if (d > lightRadius) {
  return lightColor;
 }
 
 vec3 L = normalize(lightDir);
 float NdotL = dot(L, N);
 
 if (NdotL > 0.0) {
  // diffuse light
  lightColor += calculateDiffuseTerm(NdotL, gl_LightSource[0].diffuse.rgb);
  
  // specular light
  lightColor += calculateSpecularTerm(position, gl_LightSource[0].specular.rgb, L, N, shininess) * specularPower;
 }
    
 return lightColor * materialColor * attenuation;
}

void main() {
 // fetch the G-Buffer: normal + shininess, reconstructed eye-space position, albedo + specular intensity
 vec4 normalTex = texture2D(normalMap, texCoord);
 vec3 position = getEyePositionFromDepthMap(positionMap, texCoord, nearPlane);
 vec4 color = texture2D(colorMap, texCoord);
 
 gl_FragColor = vec4(calculateColor(position, normalTex.rgb, color.rgb, normalTex.a, color.a), 1.0);
}
 
Pointlight shader
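
The helper functions are omitted above. To give an idea of what they could look like, here is a minimal sketch (not my exact implementation), with an attenuation that actually reaches zero at the fixed light radius and a Blinn-Phong specular term:

// Sketch of the omitted helpers; the falloff is chosen so the intensity really becomes 0.0 at lightRadius.
float calculateAttenuation(in float lightRadius, in float d) {
 float x = clamp(1.0 - d / lightRadius, 0.0, 1.0);
 return x * x;
}

vec3 calculateDiffuseTerm(in float NdotL, in vec3 lightDiffuse) {
 return lightDiffuse * NdotL;
}

vec3 calculateSpecularTerm(in vec3 position, in vec3 lightSpecular, in vec3 L, in vec3 N, in float shininess) {
 // in eye space the camera is at the origin, so the view vector is simply -position
 vec3 V = normalize(-position);
 vec3 H = normalize(L + V);
 return lightSpecular * pow(max(dot(N, H), 0.0), shininess);
}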

Final Words

For me deferred rendering is a more "natural" way of lighting the scene. There is a strict separation between the objects and the light sources. The object shaders do not have to bother with the lighting, which really simplifies the object rendering. Lighting is not achieved by rendering the objects but by rendering the lights. As each light is drawn by itself, a specialized shader can be used for it, which simplifies the lighting shaders as well. Bump mapping is easy to implement as it only requires a new object shader that correctly writes the normals to the G-Buffer; no changes to the lights are needed.
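
For example, a normal-mapped variant of the G-Buffer shader from above only has to bend the normal before writing it out. A minimal sketch, assuming the tangent and bitangent are passed in as varyings from the vertex shader (they are not part of the shader shown earlier):

varying vec3 normal;
varying vec3 tangent;   // assumed to be supplied by the vertex shader
varying vec3 bitangent;
varying vec3 position;
varying vec2 texCoord;

uniform sampler2D colorMap;
uniform sampler2D normalMap;

void main() {
 // fetch the tangent-space normal and rotate it into eye space
 vec3 n = texture2D(normalMap, texCoord).rgb * 2.0 - 1.0;
 mat3 tbn = mat3(normalize(tangent), normalize(bitangent), normalize(normal));
 vec3 bumpedNormal = normalize(tbn * n);

 // the G-Buffer layout stays exactly the same
 gl_FragData[0] = vec4(bumpedNormal, gl_FrontMaterial.shininess);
 gl_FragData[1] = vec4(texture2D(colorMap, texCoord).rgb * gl_FrontMaterial.diffuse.rgb, 1.0);
 gl_FragData[2] = vec4(length(position));
}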

But the deferred rendering technique is not the holy grail of computer graphics. As always there are special cases and limitations. A simple example: the skybox. Since the skybox should be drawn without lighting, it has to be treated differently. Transparent objects are difficult to handle because the G-Buffer only contains objects that are fully opaque; the usual solution is to draw the transparent objects with a forward renderer. To get antialiasing you also have to look for different methods, since the G-Buffer textures are not antialiased. There are related techniques like Deferred Lighting or Inferred Lighting that try to solve some of these problems, but I don't know much about them.

While there's a lot more to an engine than simply choosing deferred or forward rendering, I am still excited to see which method establishes itself as the "default" in the future. On the one hand, games like Crysis (link, link), GTA IV (modded version) or StarCraft 2 use some kind of deferred renderer, and you can't say that they look ugly or perform badly. On the other hand, a big representative of the traditional approach is the Unreal Engine, and they showed a while ago that their way still works very well.

Further Links

On my way to building a deferred renderer I stumbled over a few links that I want to share (in no particular order):
