Sunday, 20 June 2010

The difference between RGBM and RGBD

If you want to encode high dynamic range, but can't afford a high precision format (such as FP16), then RGBM is a great option. But what about its often neglected brother, RGBD?

What is the difference, and when should they be used?
In this short post, I intend to highlight some of the subtle differences.

First, an overview of RGBM:
RGBM is an 8-bit RGBA format, where Alpha is sacrificed to store a shared multiplier.

Decoding RGBM:
float3 DecodeRGBM(float4 rgbm)
{
    return rgbm.rgb * (rgbm.a * MaxRange);
}

This produces a range of 0 to MaxRange.

Assuming MaxRange is 65025 (255 * 255), and writing M for the 8-bit value stored in Alpha:
When M=1, the range is 0, 1, 2, 3 ... 255,
When M=2, the range is 0, 2, 4, 6 ... 510,
When M=3, the range is 0, 3, 6, 9 ... 765,
When M=255, the range is 0, 255, 510, 765 ... 65025.

Encoding RGBM:
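float4 EncodeRGBM(float3 rgb)
{
    float maxRGB = max(rgb.r, max(rgb.g, rgb.b));
    float M = maxRGB / MaxRange;
    // Round M up to the next 8-bit step so the stored rgb never exceeds 1.
    // The max() keeps M above zero, so pure black does not divide by zero.
    M = max(ceil(M * 255.0), 1.0) / 255.0;
    return float4(rgb / (M * MaxRange), M);
}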
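
To see what the 8-bit storage does to a value, here is a rough CPU-side sketch of the same encode and decode in Python. The quantize8 helper and the clamp on M are my own assumptions about how an RGBA8 target would behave; they are not part of the shader code above.

```python
import math

MAX_RANGE = 65025.0  # same example value as above (255 * 255)

def quantize8(x):
    """Clamp to 0..1 and round to the nearest 8-bit step (assumed RGBA8 behaviour)."""
    return round(min(max(x, 0.0), 1.0) * 255.0) / 255.0

def encode_rgbm(rgb):
    max_rgb = max(rgb)
    # Round the multiplier up to the next 8-bit step.
    m = math.ceil(max_rgb / MAX_RANGE * 255.0) / 255.0
    # Assumption: clamp M so black does not divide by zero.
    m = min(max(m, 1.0 / 255.0), 1.0)
    return tuple(quantize8(c / (m * MAX_RANGE)) for c in rgb) + (m,)

def decode_rgbm(rgbm):
    r, g, b, m = rgbm
    return tuple(c * (m * MAX_RANGE) for c in (r, g, b))

encoded = encode_rgbm((1000.0, 500.0, 20.0))
print(encoded, decode_rgbm(encoded))
```

With M stored as 4/255, every channel moves in steps of 4, so values that are not multiples of 4 would come back rounded.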

RGBD follows the same rules as RGBM; however, it stores a divisor in Alpha instead of a multiplier.

Decoding RGBD:
float3 DecodeRGBD(float4 rgbd)
{
    return rgbd.rgb * ((MaxRange / 255.0) / rgbd.a);
}

Encoding RGBD:
float4 EncodeRGBD(float3 rgb)
{
    float maxRGB = max(rgb.r, max(rgb.g, rgb.b));
    float D = max(MaxRange / maxRGB, 1);
    // Round D down to a whole 8-bit step so the stored rgb never exceeds 1
    D = saturate(floor(D) / 255.0);
    return float4(rgb * (D * (255.0 / MaxRange)), D);
}

As can be seen, RGBD is a bit more complex in both the encode and the decode. The encode is trickier, as RGBD is a bit more sensitive to values outside the range [0, MaxRange].
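
The same kind of rough Python sketch works for RGBD. As before, quantize8 and the small floor on maxRGB are assumptions of mine to mimic an RGBA8 target and to avoid dividing by zero; the shader code above does not include them.

```python
import math

MAX_RANGE = 65025.0  # same example value as above (255 * 255)

def quantize8(x):
    """Clamp to 0..1 and round to the nearest 8-bit step (assumed RGBA8 behaviour)."""
    return round(min(max(x, 0.0), 1.0) * 255.0) / 255.0

def encode_rgbd(rgb):
    # Assumption: tiny floor on maxRGB so black does not divide by zero.
    max_rgb = max(max(rgb), 1e-6)
    d = max(MAX_RANGE / max_rgb, 1.0)
    # Round the divisor down to a whole 8-bit step, then saturate.
    d = min(math.floor(d) / 255.0, 1.0)
    return tuple(quantize8(c * (d * (255.0 / MAX_RANGE))) for c in rgb) + (d,)

def decode_rgbd(rgbd):
    r, g, b, d = rgbd
    return tuple(c * ((MAX_RANGE / 255.0) / d) for c in (r, g, b))

encoded = encode_rgbd((1000.0, 500.0, 20.0))
print(encoded, decode_rgbd(encoded))
```

For this input the divisor is stored as 65/255, and the decoded values land close to, but not exactly on, the originals.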

But when should either be used?
On the face of it, RGBM looks like the logical choice. It's easier to understand, and easier to use.
For the most part, it is the best choice. But not always. Here is why:

Distribution of values:

Neither format can store 65k distinct luminance values (16 bits). However, the two have radically different distributions of the values they can store. The following images are gamma-space sample distributions for both formats:


If you are after good precision over the entire range, then RGBM will do. If you only want to store occasional highlights, RGBD may be a better option. The 0-1 range is clearly visible in the RGBD distribution (Assuming MaxRange was 256).
However, to take full advantage of both formats would require a much smarter encoder.
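
One way to put numbers on the distribution is to enumerate every value a single channel could decode to, and count how many land in the 0-1 range. This Python sketch assumes MaxRange is 256 and an ideal encoder that is free to pick any Alpha value, which the simple encoders above do not do. The function names are mine.

```python
from fractions import Fraction

MAX_RANGE = 256  # the value assumed for the RGBD distribution above

def rgbm_values():
    """Every decodable channel value: (c/255) * (m/255) * MaxRange."""
    return {Fraction(c * m * MAX_RANGE, 255 * 255)
            for c in range(256) for m in range(256)}

def rgbd_values():
    """Every decodable channel value: (c/255) * (MaxRange/255) / (d/255)."""
    # d = 0 is skipped because it cannot be decoded.
    return {Fraction(c * MAX_RANGE, 255 * d)
            for c in range(256) for d in range(1, 256)}

def count_up_to_one(values):
    return sum(1 for v in values if v <= 1)

rgbm_low = count_up_to_one(rgbm_values())
rgbd_low = count_up_to_one(rgbd_values())
print("RGBM values in 0-1:", rgbm_low)
print("RGBD values in 0-1:", rgbd_low)
```

Exact fractions are used so that two Alpha/colour pairs that decode to the same value are only counted once.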


The two formats interpolate differently, and this may be the main reason you would not choose RGBM.
The following graphs demonstrate how the formats interpolate in two very different situations:

The first is a very simple transition from 255 to 256. In both formats, this requires the alpha value to change. For RGBM, the encoding is {255,1} and {128,2}. For RGBD, there are numerous combinations, however the graph shows {255,1} and {128,0.5}.


Here the primary limitation of RGBM is revealed. The interpolation between {255,1} and {128,2} is {191.5,1.5} - which multiplies to 287.25. Clearly not 255.5!
It's worth noting the RGBD interpolation isn't exactly right either.
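
The midpoint arithmetic can be checked with a few lines of Python. This sketch uses the same simplified {value, alpha} notation as the text (value times multiplier for RGBM, value divided by divisor for RGBD) rather than the real 0-1 encoded values.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def rgbm_value(c, m):
    return c * m   # value times shared multiplier

def rgbd_value(c, d):
    return c / d   # value divided by shared divisor

t = 0.5
expected = lerp(255.0, 256.0, t)
# Interpolate the stored components first, then decode, as texture filtering does.
rgbm_mid = rgbm_value(lerp(255.0, 128.0, t), lerp(1.0, 2.0, t))
rgbd_mid = rgbd_value(lerp(255.0, 128.0, t), lerp(1.0, 0.5, t))

print("expected:", expected)
print("RGBM:", rgbm_mid)
print("RGBD:", rgbd_mid)
```

RGBM overshoots by a wide margin, while RGBD comes out slightly under the expected value.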

For a larger interpolation, 100->1000, RGBD produces a less accurate result. In the following graph, the encoded values are {100,1} and {250,4} for RGBM, and {100,1} and {250,0.25} for RGBD.


Neither scheme produces a linear interpolation.
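
For the 100->1000 case, a small sweep shows how far each format drifts from a straight line. Again this is only a sketch in the simplified {value, alpha} notation, and the sweep helper is my own.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def sweep(steps=4):
    """Return (t, linear, rgbm, rgbd) rows for the 100 -> 1000 transition."""
    rows = []
    for i in range(steps + 1):
        t = i / steps
        linear = lerp(100.0, 1000.0, t)
        # RGBM: {100,1} -> {250,4}, decoded as value * multiplier
        rgbm = lerp(100.0, 250.0, t) * lerp(1.0, 4.0, t)
        # RGBD: {100,1} -> {250,0.25}, decoded as value / divisor
        rgbd = lerp(100.0, 250.0, t) / lerp(1.0, 0.25, t)
        rows.append((t, linear, rgbm, rgbd))
    return rows

for row in sweep():
    print(row)
```

Both formats match at the end points and sag below the straight line in between, with RGBD sagging further.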

In summary, neither format is a clear winner. If you need subtle interpolation or good retention of detail in the lower range, then consider taking the hit and choosing RGBD over RGBM.

Tuesday, 11 May 2010

Film Squared

Sometimes you mess up, sometimes it looks good.

Previously I'd mentioned the film approximation tone mapping equation, film(colour).

Well, this is what sqrt(film(colour*colour)) looks like.
(Graph comparison)

Saturday, 8 May 2010

Monday, 19 April 2010

Approximating Film with Tonemapping


Please note:

I now realise that a bunch of this post is a complete load of rubbish.

- The film approximation does have nice black compression. It's just subtle, and doesn't show up in the linear graph (doh!)


Recently, I finished up a big tutorial for HDR rendering techniques.
This tutorial covered Tone Mapping and Gamma Correction.

If you are not aware of Gamma Correction or Tone Mapping:
Gamma Correction lets you compensate for the human eye's non-linear perception of light. To put it simply, if you double the value of a pixel on screen, you actually more than quadruple the intensity of the light output by your monitor!
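
As a quick sanity check of the "more than quadruple" claim, here is a tiny Python sketch. It assumes a display gamma of 2.2, which is a typical figure and not something measured here.

```python
GAMMA = 2.2  # assumption: typical display gamma

def displayed_intensity(pixel_value):
    """Light output for a 0..1 pixel value on an idealised gamma display."""
    return pixel_value ** GAMMA

# Doubling the pixel value from 0.25 to 0.5:
ratio = displayed_intensity(0.5) / displayed_intensity(0.25)
print(ratio)
```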

Tone Mapping is how you compress the huge visible range of light into something more suitable to display on a limited contrast monitor. It varies by game, and is often the most important factor in determining the 'look' of a game.

Originally, I had two Tone Mapping methods available in the tutorial: Exponential Exposure (my personal favorite) and the function (x / (x+1)), which is commonly used - including samples in the DirectX SDK.

Broadly speaking, these both produced very similar results - with exp producing a more saturated result. However, they both suffered from loss of saturation the brighter the image got.
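
The loss of saturation is easy to reproduce on the CPU. This Python sketch assumes both operators are applied per channel, and assumes the usual 1 - exp(-k * x) form for exponential exposure (k is a hypothetical exposure scale). The saturation measure and the test colour are arbitrary choices of mine.

```python
import math

def reinhard(x):
    return x / (x + 1.0)

def exposure(x, k=1.0):
    # Assumed form of "exponential exposure"; k is a hypothetical exposure scale.
    return 1.0 - math.exp(-k * x)

def saturation(rgb):
    """Crude saturation measure: (max - min) / max."""
    return (max(rgb) - min(rgb)) / max(rgb)

def tonemapped_saturation(op, rgb, brightness):
    return saturation([op(c * brightness) for c in rgb])

colour = (4.0, 1.0, 0.25)  # arbitrary saturated test colour
for brightness in (0.1, 1.0, 10.0):
    print(brightness,
          tonemapped_saturation(reinhard, colour, brightness),
          tonemapped_saturation(exposure, colour, brightness))
```

As the brightness goes up, all three channels are pushed towards 1 and the measured saturation drops for both operators.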

Then one day a friend introduced me to the light response curve for Film.
He showed me this image, which is the response curve of Kodak's high end motion picture film.
Also, he introduced me to the following approximation, which he said was a close match to the Kodak film:

rgb = Math.Max(0, input - 0.004);
output = rgb*(0.5+6.2*rgb)/(0.06+rgb*(1.7+6.2*rgb));

It even approximated gamma correction. Amazing!
I quickly implemented it as a 3rd tone mapping option. The results were good, the curve maintained saturation and produced really clear blacks.
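
For anyone who wants to plot the curve themselves, here is the same approximation as a small Python function. It is a direct transcription of the two lines above, applied to a single channel; the sample inputs are arbitrary.

```python
def film(x):
    """Film approximation for one linear channel; output already includes gamma."""
    x = max(0.0, x - 0.004)  # hard cutoff to black
    return (x * (0.5 + 6.2 * x)) / (0.06 + x * (1.7 + 6.2 * x))

for value in (0.0, 0.004, 0.05, 0.18, 1.0, 4.0):
    print(value, film(value))
```

Anything at or below 0.004 comes out as pure black, and very bright inputs approach 1.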

(I later discovered this function - and that graph - were both from Naughty Dog's excellent paper 'Uncharted 2: HDR Lighting')

For comparison, here are the different tone mapping graphs (with gamma correction):

(Graph: x / (x + 1))

(Graph: exponential exposure)

(Graph: film approximation)

As can be seen, the film approximation is more 'punchy' than the previous two. However, looking closer, I was a bit disappointed:

(Graph: deep blacks of the film approximation)

I was expecting something similar to the actual film response curve, with compressed blacks.
Now, perhaps I'm simply not using the right magic numbers - but unfortunately I cannot find the original source for the approximation.

I wasn't entirely happy with this, so I set about improving it.
I didn't want a cutoff to black, I wanted a nice transition into compressed blacks - similar to the film curve.

This is the revised approximation:

const float cutoff = 0.025;
rgb += (cutoff * 2 - rgb) * saturate(cutoff * 2 - rgb)
* (0.25f / cutoff) - cutoff;
output = rgb*(0.5+6.2*rgb)/(0.06+rgb*(1.7+6.2*rgb));

It looks more complex than it is.
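
Expanding the first line shows why: for inputs between 0 and 2 * cutoff it reduces to x * x / (4 * cutoff), a parabola with zero slope at black, and above that it is simply x - cutoff. The Python sketch below checks this numerically. The function name compress_blacks is mine, and saturate is re-implemented to match the shader intrinsic.

```python
CUTOFF = 0.025

def saturate(x):
    return min(max(x, 0.0), 1.0)

def compress_blacks(x, cutoff=CUTOFF):
    """First line of the revised approximation, for a single channel."""
    return x + (cutoff * 2 - x) * saturate(cutoff * 2 - x) * (0.25 / cutoff) - cutoff

for x in (0.0, 0.0125, 0.025, 0.05, 0.1):
    toe = x * x / (4 * CUTOFF)   # expected below 2 * cutoff
    line = x - CUTOFF            # expected above 2 * cutoff
    print(x, compress_blacks(x), toe, line)
```

At x = 2 * cutoff the parabola and the line meet with the same value and the same slope, which is what gives the smooth transition.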
(Graph: deep blacks of the film approximation, using black compression)

That's much better!

The big upshot of this is that the cutoff value can be boosted significantly.
Here is a set of images comparing black compression with a boosted cutoff:

(Image: Exposure Tone Mapping)
(Image: Film Approximation)
(Image: Film Approximation + Cutoff)
(Image: Film Approximation + Black compression)

Exponential Exposure Tone Mapping lacks saturation. The Film Approximation is much better, but still looks a bit desaturated in the shadows. Turning on the cutoff helps, but destroys detail in the deep blacks. Turning on black compression isn't as aggressive as the cutoff, but retains both saturation and deep black detail.

(Note, this demonstration is highly dependent on your monitor!)

Thursday, 15 April 2010

Contemplating future Programming Languages, part 1

Here is a tricky question: What will the average CPU look like in 2020?

Will it be,
A: An evolution of current desktop CPUs? A number of powerful cores, all very good at maintaining high efficiency.

B: An evolution of console CPUs/Larrabee? A large number of cores that are very fast, but struggle with sub-optimal code.

C: An evolution of the GPU? A massive number of simple cores that are incredibly fast, but demand high task and data parallelism to run at good levels of efficiency.

D: A hybrid of all three? Or how about something completely different?

It's pretty clear that the number of threads will be huge. Today, a Core i7 Gulftown runs 12 threads. A 32 core Larrabee would run 128 threads. A modern GPU runs 1,000s of threads.
But how wide will the pipe be? Each Larrabee core sports a 16+1 vector/scalar unit. And what about cache coherency?

I don't know the answer to any of these questions. But I have come to a conclusion:

Ultimately, how the hardware evolves isn't that important.
What keeps me up at night is how you drive that hardware...

Friday, 9 April 2010

Is this thing on?

As a programmer, there is nothing better than an obsessive attention to detail.
The ability to make logical connections between cause and effect, often from the smallest morsel of information is simply invaluable.

As a person trying to get to sleep, there is nothing worse than an obsessive attention to detail.
The ability to make illogical connections between cause and effect, often from the smallest morsel of information is simply infuriating.

Often your most valuable abilities can also be the hardest to live with.
Sometimes when stuck in an infinite loop you only have two options: burn out or write down what is on your mind.