Fur Shader Passes
Revision as of 19:43, 12 September 2023
My shader draws the fur as geometric layers, stacked on top of each other.
Rendering Passes
GPU programs (i.e. "shaders") render objects in passes. Each pass runs through the shader pipeline from start to finish (i.e. from mesh to pixels) before another pass can start.
Fast Fur requires 3 mandatory passes per frame, with shadows and complex light sources each requiring 1 more.
Why not only 1 mandatory pass?
Because 3 passes is faster!
Pass 1 - The Skin
The first version of my shader did everything in 1 pass, because that's what other fur shaders were doing.
I was then asked to add some expensive lighting effects, like specular lighting and normal maps. I did, and the shader's speed dropped by more than half. Yikes!
The compromise I decided to try was rendering only the skin "expensively". To do that, I had to split the Skin and the Fur off into separate passes. Then I speed-tested it against the original, simpler, 1-pass version.
Guess which one was faster?
The more complex, "expensive" looking version.
...
Um, wut?
That wasn't what I was expecting, lol..., but I'll take it!
As far as I can tell (and this really is just a guess!), rendering the opaque skin at the same time as the mostly see-through fur results in the GPU texture caches missing a lot. So it's actually faster to render the skin completely separately, in its own pass, than to try to draw it at the same time as the other fur layers.
The speed increase is so great, in fact, that it more than offsets the speed decrease of rendering the skin "expensively". So it was a win-win discovery for both speed and quality!
Pass 2 - The Undercoat
One of the speed tricks added in v2.0 was to move the skin outwards as you got further away. You can't see the bottom layers of fur when you get farther away, so moving the skin outwards allowed the shader to skip rendering them.
It worked great! It was a ~30% speed boost with no noticeable loss of quality!
Except for the seams... yeah, those are very much a noticeable loss of quality.
Moving the skin outwards creates gaps if your UV seams don't line up perfectly, which virtually nobody's do.
On most avatars, it's okay, because the fur is usually thick enough where the seams are that it hides them pretty well.
But for a lot of people, the seams are really, really bad. For them, the option to turn "Body Expansion when Far" off was added, but that means ~30% lower quality because of lower layer density.
The compromise I decided to try was to render the bottom fur layer as opaque and move that outwards, instead of the skin.
Visually, this works, because the skin fills in any visible seams.
Performance-wise, it doesn't, likely for the same reason it was slower to render the skin at the same time as the fur. They are just too different looking.
So the Undercoat pass was added.
Pass 3 - The Fur
Almost nothing is rendered "correctly" by the fur layers. They take every short-cut possible, resulting in all sorts of lighting errors and glitches (if you know what to look for).
However, speed is absolutely essential, because these layers make up the vast majority of the render time. Also, more speed means more layers can be rendered per frame, allowing the brute-force tactic of throwing more resolution at the screen to make the fur look better.
The fur also has a couple of tricks up its sleeve:
Sneaky trick #1: it can turn itself off by having the Hull shader ask the Tessellator to multiply the number of triangles by 0, which throws them out. Any off-screen, far-away, or backwards-facing triangles are thus discarded before the expensive Geometry shader stage runs. Otherwise, it tells the Tessellator to multiply the number of triangles by 1, which does nothing. The triangle is simply passed to the Geometry shader and thus the fur is rendered normally.
Sneaky trick #2: it can have the Hull shader ask the Tessellator to split each triangle into 4 when taking a photo. However, the Domain shader then ignores the Tessellator's output, and instead just makes 4 copies of the original, un-tessellated triangle. The fur can thus be rendered at up to 4x resolution, but only when needed.
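The two tricks boil down to one decision per triangle: how many copies of it should reach the Geometry shader. Below is a rough CPU-side sketch of that decision in Python, not the actual shader code. The names `TriangleInfo`, `fur_copies`, `domain_stage`, and `max_fur_distance` are made up for illustration, and the multiplier is modelled as a plain integer rather than as real tessellation factors (the exact factors needed to get 4 sub-triangles are not covered here).

```python
from dataclasses import dataclass


@dataclass
class TriangleInfo:
    """Hypothetical per-triangle facts the Hull shader would work out."""
    on_screen: bool       # False if the triangle is outside the view
    distance: float       # distance from the camera
    facing_camera: bool   # False for backwards-facing triangles


def fur_copies(tri: TriangleInfo, max_fur_distance: float, taking_photo: bool) -> int:
    """Return the multiplier requested from the Tessellator: 0, 1, or 4."""
    # Trick #1: multiply by 0 to throw the triangle away early.
    if not tri.on_screen or not tri.facing_camera or tri.distance > max_fur_distance:
        return 0
    # Trick #2: ask for 4x only when taking a photo, otherwise 1x (a no-op).
    return 4 if taking_photo else 1


def domain_stage(original_triangle, copies: int) -> list:
    """Ignore any tessellated positions and emit copies of the original triangle."""
    return [original_triangle for _ in range(copies)]


if __name__ == "__main__":
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    info = TriangleInfo(on_screen=True, distance=2.0, facing_camera=True)
    n = fur_copies(info, max_fur_distance=10.0, taking_photo=True)
    print(len(domain_stage(tri, n)))
```

The point of routing both cases through the same multiplier is that culled triangles cost nothing past the Hull shader, while the 4x path reuses the same mechanism and only kicks in when the extra resolution is wanted.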
Fur Passes - "furFadeIn"
At close range, the variable "furFadeIn" is at 1.0, and it gradually goes to 0.0 at the furthest range. The 3 passes react to "furFadeIn" as follows:
"furFadeIn" | Skin | Undercoat | Fur |
---|---|---|---|
1.0 | At 0% height | Is bottom hair layer (i.e. see-through) | Renders all hair layers, except bottom |
0.999 -> 0.201 | At 0% height | Is opaque and expands towards 35% height as "furFadeIn" decreases. | Layer density gradually decreases |
0.2 -> 0.001 | At 35% height | Is translucent and expands towards 50% height as "furFadeIn" decreases. | Layer density gradually decreases and gets squeezed together |
0.0 | At 35% height | Is translucent and at 50% height | Is not rendered |
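The table can be read as a small piecewise function of "furFadeIn". Here is a rough Python sketch of that reading, not the shader's actual code. The function name `pass_state` and the dictionary keys are made up, the linear interpolation between the listed heights is an assumption, and so is the 0% starting height for the Undercoat (the table only says it is the bottom hair layer at close range).

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def pass_state(fur_fade_in: float) -> dict:
    """Map "furFadeIn" (1.0 = close, 0.0 = furthest) to the state of the 3 passes.

    Heights are fractions of the full fur height (0.35 means 35%).
    """
    f = min(max(fur_fade_in, 0.0), 1.0)

    if f >= 1.0:
        # Close range: the Undercoat is just the see-through bottom hair layer.
        # Assumption: its height is treated as 0.0 here.
        return {"skin_height": 0.0, "undercoat": "see-through",
                "undercoat_height": 0.0, "fur_rendered": True}

    if f > 0.2:
        # Mid range: the opaque Undercoat expands towards 35% as furFadeIn drops.
        t = (1.0 - f) / 0.8
        return {"skin_height": 0.0, "undercoat": "opaque",
                "undercoat_height": lerp(0.0, 0.35, t), "fur_rendered": True}

    # Far range: the Skin sits at 35%, the translucent Undercoat expands towards 50%.
    t = (0.2 - f) / 0.2
    return {"skin_height": 0.35, "undercoat": "translucent",
            "undercoat_height": lerp(0.35, 0.50, t),
            "fur_rendered": f > 0.0}  # the Fur pass is skipped at exactly 0.0


if __name__ == "__main__":
    for value in (1.0, 0.6, 0.2, 0.1, 0.0):
        print(value, pass_state(value))
```

Layer density is left out of the sketch, because the table only says it "gradually decreases" without giving numbers.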