Fur Shader Passes: Difference between revisions

From Warren's Fast Fur Shader
 
(15 intermediate revisions by the same user not shown)
Line 1: Line 1:
My shader draws the fur as geometric layers, stacked on top of each other.
== Rendering Passes ==
== Rendering Passes ==
GPU programs (ie. "shaders") render objects in passes. Each pass runs through the shader pipeline, from start to finish (ie. from mesh to pixels), before another pass can start.
Fast Fur is a 3-pass shader. Each pass runs from start to finish before the next pass starts.
 
Fast Fur requires 3 mandatory passes per frame, with shadows and complex light sources each requiring 1 more.
 
Why not only 1 mandatory pass?
 
Because 3 passes is faster!


== Pass 1 - The Skin ==
== Pass 1 - The Skin ==
The first version of my shader did everything in 1 pass, because that's what other fur shaders were doing.
Many lighting effects, such as proper specular lighting and normal maps, are simply too slow to calculate for every layer of fur. As a compromise, my shader only renders the skin "expensively".
 
I was then asked to add some expensive lighting effects, like specular lighting and normal maps. I did, and the shader's speed dropped by more than half. Yikes!
 
The compromise I decided to try was rendering only the skin "expensively". To do that, I had to split the Skin and Fur off into separate passes. Then I speed-tested it against the original, simpler, 1-pass version.
 
Guess which one was faster?
 
The more complex, "expensive" looking version.
 
...
 
Um, wut?
 
That wasn't what I was expecting, lol..., but I'll take it!
 
As far as I can tell (and this really is just a guess!), rendering the opaque skin at the same time as the mostly see-through fur results in the GPU texture caches missing a lot. So it's actually faster to render the skin completely separately, in its own pass, than to try to draw it at the same time as the other fur layers.
 
The speed increase is so great, in fact, that it more than offsets the speed decrease of rendering the skin "expensively". So it was a win-win discovery for both speed and quality!
 
== Pass 2 - The Undercoat ==
One of the speed tricks added in v2.0 was to move the skin outwards as you got further away. You can't see the bottom layers of fur when you get farther away, so moving the skin outwards allowed the shader to skip rendering them.
 
It worked great! It was like a ~30% speed boost with no noticeable loss of quality!
 
Except for the seams... yeah, those are very much a noticeable loss of quality.
 
Moving the skin outwards creates gaps if your UV seams don't line up perfectly, which virtually nobody's do.
 
On most avatars, it's okay, because the fur is usually thick enough where the seams are that it hides them pretty well.
 
But for a lot of people, the seams are really, really bad. For them, the option to turn "Body Expansion when Far" off was added, but that means ~30% lower quality because of lower layer density.
 
The compromise I decided to try was to render the bottom fur layer as opaque and move that outwards, instead of the skin.


Visually, this works, because the skin fills in any visible seams.
However, an unexpected benefit of rendering the skin separately is that it actually renders faster than doing everything in 1 pass. As far as I can tell (and this is really just a guess!), trying to render the opaque skin at the same time as the mostly see-through fur results in the GPU caches missing a lot.


Performance-wise, it doesn't, likely for the same reason it was slower to render the skin at the same time as the fur. They are just too different looking.
The speed increase is so great, in fact, that 16 "cheap" fur layers + 1 "expensive" skin layer renders significantly faster than 15 "cheap" fur layers + 1 "cheap" skin layer.


So the Undercoat pass was added.
== Pass 2 - The Fur ==
Almost nothing is rendered "correctly" by the fur layers. They take every short-cut possible, resulting in all sorts of lighting errors and glitches (if you know what to look for).


== Pass 3 - The Fur ==
However, speed is absolutely essential, because these layers make up the vast majority of the render time, and rendering lots of "cheap" layers looks significantly better than rendering only half as many "expensive" layers.
Almost nothing is rendered "correctly" by the fur layers. They take every short-cut possible, resulting in all sorts of lighting errors and glitches.


However, speed is absolutely essential, because these layers make up the vast majority of the render time. Also, more speed means more layers can be rendered per frame, allowing the brute-force tactic of throwing more resolution at the screen to make the fur look better.
The shader also has a couple of tricks up its sleeve:


The fur also has a couple of tricks up its sleeve:
* Trick #1: it can turn the fur layers off (if too far away, offscreen, or backwards) by having the Hull shader ask the Tessellator to multiply the number of triangles by 0, which throws them out before they reach the Geometry shader stage. This is a MASSIVE speed boost, roughly doubling the speed of the shader.


Sneaky trick #1: it can turn itself off by having the Hull shader ask the Tessellator to multiply the number of triangles by 0, which throws them out. Any off-screen, far-away, or backwards-facing triangles are thus discarded before the expensive Geometry shader stage runs. Otherwise, it tells the Tessellator to multiply the number of triangles by 1, which does nothing. The triangle is simply passed to the Geometry shader and thus the fur is rendered normally.
* Trick #2: it can have the Hull shader ask the Tessellator to split each triangle into 4 when taking a photo. However, the Domain shader then ignores the Tessellator's instructions, and instead just makes 4 copies of the original, un-tessellated triangle. The fur can thus be rendered at up to 4x resolution, but only when needed.


Sneaky trick #2: it can have the Hull shader ask the Tessellator to split each triangle into 4 when taking a photo. However, the Domain shader then ignores the Tessellator, and instead just makes 4 copies of the original, un-tessellated triangle. The fur can thus be rendered at up to 4x resolution, but only when needed.
== Pass 3 - The Overcoat ==
Added in v5.0, the overcoat is a translucent layer of fur that acts as a cheap visual stand-in for actual fur when the avatar is far away. It allows the shader to turn the "real" fur layers off completely, which results in a significant speed boost.


== Fur Passes - "furFadeIn" ==
At close range the overcoat is not visible, but as distance increases the overcoat fades in and expands outwards to about 2/3rds the thickness of the fur.
At close range, the variable "furFadeIn" is at 1.0, and it gradually goes to 0.0 at the furthest range. The 3 passes react to "furFadeIn" as follows:
{| class="wikitable"
|+
!"furFadeIn"
!Skin
!Undercoat
!Fur
|-
|1.0
|At 0% height
|Is bottom hair layer (ie. see-through)
|Renders all hair layers, except bottom
|-
|0.999 -> 0.201
|At 0% height
|Is opaque and expands towards 35% height as "furFadeIn" decreases.
|Layer density gradually decreases
|-
|0.2 -> 0.001
|At 35% height
|Is translucent and expands towards 50% height as "furFadeIn" decreases.
|Layer density gradually decreases and gets squeezed together
|-
|0.0
|At 35% height
|Is translucent and at 50% height
|Is not rendered
|}
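
The table can be read as a piecewise function of "furFadeIn". The following Python sketch is only a CPU-side model of that reading, not the shader's own code. It assumes that "furFadeIn" falls from 1.0 at close range to 0.0 at the furthest range, that the switch from opaque to translucent happens at 0.2, that the undercoat starts at 0% height, and that the expansion is linear. None of those details are confirmed by the table, and the function and key names are made up for illustration.

```python
def undercoat_model(fur_fade_in):
    """Hypothetical model of the furFadeIn table (heights are fractions of fur height)."""
    f = max(0.0, min(1.0, fur_fade_in))  # clamp to the documented 0.0 .. 1.0 range

    if f >= 1.0:
        # Close range: the undercoat is just the see-through bottom hair layer.
        return {"skin_height": 0.0, "undercoat": "bottom hair layer",
                "undercoat_height": 0.0, "fur": "all layers except bottom"}

    if f > 0.2:
        # Mid range: opaque undercoat expands towards 35% (linear growth is an assumption).
        t = (1.0 - f) / 0.8
        return {"skin_height": 0.0, "undercoat": "opaque",
                "undercoat_height": 0.35 * t, "fur": "density decreasing"}

    # Far range: skin sits at 35%, translucent undercoat expands towards 50%.
    t = (0.2 - f) / 0.2
    fur = "not rendered" if f == 0.0 else "density decreasing, squeezed together"
    return {"skin_height": 0.35, "undercoat": "translucent",
            "undercoat_height": 0.35 + 0.15 * t, "fur": fur}


if __name__ == "__main__":
    for f in (1.0, 0.6, 0.2, 0.1, 0.0):
        print(f, undercoat_model(f))
```

The only point of the model is that the skin and the undercoat trade places as the opaque surface: once the undercoat turns translucent, the skin has already moved out to 35% height behind it.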

Latest revision as of 13:39, 1 January 2024

Rendering Passes

Fast Fur is a 3-pass shader. Each pass runs from start to finish before the next pass starts.

Pass 1 - The Skin

Many lighting effects, such as proper specular lighting and normal maps, are simply too slow to calculate for every layer of fur. As a compromise, my shader only renders the skin "expensively".

However, an unexpected benefit of rendering the skin separately is that it actually renders faster than doing everything in 1 pass. As far as I can tell (and this is really just a guess!), trying to render the opaque skin at the same time as the mostly see-through fur results in the GPU caches missing a lot.

The speed increase is so great, in fact, that 16 "cheap" fur layers + 1 "expensive" skin layer renders significantly faster than 15 "cheap" fur layers + 1 "cheap" skin layer.

Pass 2 - The Fur

Almost nothing is rendered "correctly" by the fur layers. They take every short-cut possible, resulting in all sorts of lighting errors and glitches (if you know what to look for).

However, speed is absolutely essential, because these layers make up the vast majority of the render time, and rendering lots of "cheap" layers looks significantly better than rendering only half as many "expensive" layers.

The shader also has a couple of tricks up its sleeve:

  • Trick #1: it can turn the fur layers off (if too far away, offscreen, or backwards) by having the Hull shader ask the Tessellator to multiply the number of triangles by 0, which throws them out before they reach the Geometry shader stage. This is a MASSIVE speed boost, roughly doubling the speed of the shader.
  • Trick #2: it can have the Hull shader ask the Tessellator to split each triangle into 4 when taking a photo. However, the Domain shader then ignores the Tessellator's instructions, and instead just makes 4 copies of the original, un-tessellated triangle. The fur can thus be rendered at up to 4x resolution, but only when needed.
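
The two tricks can be modelled as a small decision function. This is only a Python sketch of the logic described in the list, not the shader's actual Hull or Domain code. The function names, the parameters, and the single "max_fur_distance" cutoff are assumptions made for illustration.

```python
def requested_copies(on_screen, front_facing, distance, max_fur_distance, photo_mode):
    """Hull-shader stand-in: how many copies of a triangle to ask the Tessellator for.

    All parameter names are hypothetical; the real shader's tests are not shown here.
    """
    if not on_screen or not front_facing or distance > max_fur_distance:
        return 0  # Trick #1: multiply by 0, so the triangle never reaches the Geometry stage
    if photo_mode:
        return 4  # Trick #2: ask for a 4-way split when taking a photo
    return 1      # normal case: the triangle is passed through unchanged


def domain_stage(triangle, copies):
    """Domain-shader stand-in: ignore the Tessellator's positions and emit the
    original, un-tessellated triangle once per requested copy."""
    return [triangle for _ in range(copies)]


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    for photo in (False, True):
        n = requested_copies(True, True, 2.0, 10.0, photo)
        print("photo mode:", photo, "triangles sent on:", len(domain_stage(tri, n)))
```

In this model a culled triangle costs nothing past the Hull stage, which is why Trick #1 matters so much: the Geometry shader only ever sees triangles that will actually show fur.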

Pass 3 - The Overcoat

Added in v5.0, the overcoat is a translucent layer of fur that acts as a cheap visual stand-in for actual fur when the avatar is far away. It allows the shader to turn the "real" fur layers off completely, which results in a significant speed boost.

At close range the overcoat is not visible, but as distance increases the overcoat fades in and expands outwards to about 2/3rds the thickness of the fur.
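
As a rough illustration of that fade, here is a Python sketch that ties the overcoat's visibility and height to distance. It assumes a linear ramp between two hypothetical distances, "fade_start" and "fade_end". The page only says the overcoat fades in and expands to about 2/3rds of the fur thickness, so the shape of the ramp and all of the names below are guesses.

```python
def overcoat_state(distance, fade_start, fade_end, fur_thickness):
    """Return (fade, height) for the overcoat at a given distance.

    fade is 0.0 when the overcoat is invisible and 1.0 when fully faded in.
    The linear ramp and the fade_start / fade_end parameters are assumptions.
    """
    if fade_end <= fade_start:
        raise ValueError("fade_end must be greater than fade_start")

    t = (distance - fade_start) / (fade_end - fade_start)
    t = max(0.0, min(1.0, t))  # clamp so close range stays at 0 and far range at 1

    # The overcoat expands outwards to about 2/3rds of the fur thickness.
    height = t * fur_thickness * 2.0 / 3.0
    return t, height


if __name__ == "__main__":
    for d in (0.5, 3.0, 10.0):
        print(d, overcoat_state(d, fade_start=1.0, fade_end=5.0, fur_thickness=0.03))
```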