Chelovek & Tvoidrug
Tvoidrug Tvoidrug
Hey, have you ever tried turning a complex procedural shader into a minimal‑code, high‑performance piece that still looks insane? I think there's a sweet spot we can find.
Chelovek Chelovek
Sure, you can trim a heavy shader down if you focus on the essentials. Strip out redundant calculations, replace expensive functions with approximations, and precompute static values. Keep a tight loop, use uniform buffers for constants, and let the GPU do the heavy lifting. That way you get the look with fewer instructions. Let's map out the core steps and see where the savings lie.
Tvoidrug Tvoidrug
Sounds good. First, list every function call, then flag the ones that run every frame. Next, replace sin/cos with lookup tables or cheap approximations. Third, fold constant expressions into a uniform buffer. Finally, profile the loop and cut any branch that never hits. Let’s start with step one.
Chelovek Chelovek
Here’s a raw list of the function calls you’ll find in a typical complex procedural shader:
- `sin`, `cos`, `tan`
- `pow`, `exp`, `log`
- `smoothstep`, `step`, `mix`
- `texture`, `texelFetch`
- `fract`, `floor`, `ceil`
- `length`, `dot`, `cross`
- `normalize`, `reflect`, `refract`
- `abs`, `sign`, `clamp`
- `rand`, `hash`, `noise` (custom)
- `getCameraPos`, `getViewDir` (if custom helpers)
- `computeLight`, `shadowMapLookup` (custom)
- `calculateNormal`, `displace` (custom)

That’s the full set you’ll need to review for per‑frame usage.
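If you'd rather not build that inventory by hand, here's a rough Python sketch that counts the calls in a shader source string. The `SHADER` snippet and the `count_calls` helper are made up for illustration, and the regex is a heuristic rather than a real GLSL parser, so treat the numbers as a starting point.

```python
import re
from collections import Counter

# Names taken from the list above; the custom helpers are assumptions
# about what a given shader defines.
CANDIDATES = {
    "sin", "cos", "tan", "pow", "exp", "log",
    "smoothstep", "step", "mix", "texture", "texelFetch",
    "fract", "floor", "ceil", "length", "dot", "cross",
    "normalize", "reflect", "refract", "abs", "sign", "clamp",
    "rand", "hash", "noise", "getCameraPos", "getViewDir",
    "computeLight", "shadowMapLookup", "calculateNormal", "displace",
}

def count_calls(source: str) -> Counter:
    """Count occurrences of candidate function calls in shader source."""
    # Drop // and /* */ comments so commented-out code is not counted.
    source = re.sub(r"//[^\n]*|/\*.*?\*/", "", source, flags=re.S)
    counts = Counter()
    # An identifier followed by "(" is treated as a call (heuristic).
    for name in re.findall(r"\b([A-Za-z_]\w*)\s*\(", source):
        if name in CANDIDATES:
            counts[name] += 1
    return counts

# Hypothetical sample shader, only here to exercise the counter.
SHADER = """
float wave(vec2 p, float t) {
    // sin(p.y) is commented out and should not count
    return sin(p.x + t) * cos(p.y) + pow(sin(t), 2.0);
}
"""

if __name__ == "__main__":
    for name, n in count_calls(SHADER).most_common():
        print(f"{name}: {n}")
```

Sorting by count gives a first guess at which calls deserve attention, though a call inside a loop matters more than its raw count suggests.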
Tvoidrug Tvoidrug
Great list, let’s split it: expensive trigs (sin/cos/tan), exponentials (pow/exp/log), texture fetches, and custom noise/hash calls are the usual suspects. Anything that can be pre‑calculated or approximated is a candidate. Also check if `normalize`, `reflect`, `refract` are truly needed per‑pixel or could be baked into a normal map. Start tagging each function, mark “per‑frame” or “constant”, then we can prune or replace. Let’s dive into the first batch.
Chelovek Chelovek
- sin – per‑frame
- cos – per‑frame
- tan – per‑frame
- pow – per‑frame
- exp – per‑frame
- log – per‑frame
- smoothstep – per‑frame
- step – per‑frame
- mix – per‑frame
- texture – per‑frame
- texelFetch – per‑frame
- fract – per‑frame
- floor – per‑frame
- ceil – per‑frame
- length – per‑frame
- dot – per‑frame
- cross – per‑frame
- normalize – per‑frame (unless normal map baked)
- reflect – per‑frame (unless baked)
- refract – per‑frame (unless baked)
- abs – per‑frame
- sign – per‑frame
- clamp – per‑frame
- rand – per‑frame (or seed‑based)
- hash – per‑frame
- noise – per‑frame (or cached)
- getCameraPos – per‑frame (camera changes)
- getViewDir – per‑frame
- computeLight – per‑frame (lighting changes)
- shadowMapLookup – per‑frame (shadow changes)
- calculateNormal – per‑frame (unless baked normal map)
- displace – per‑frame (if vertex‑animated)

Anything marked “per‑frame” is a candidate for approximation or pre‑calc; constant ones can be moved to uniforms or baked.
Tvoidrug Tvoidrug
Nice breakdown—time to start substituting cheap sin/cos approximations, moving the constant terms into uniforms, and caching that noise into a 2D texture. Then we can profile the loop and see which branches still bite. Let's tackle the first ten functions.
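For the noise caching, here's a small CPU-side sketch in Python of what "bake once, sample with wrapping and bilinear filtering" looks like. The `hash01` function and its constants are hypothetical stand-ins for whatever hash the shader really uses, and `SIZE` is an assumed resolution.

```python
import math

SIZE = 64  # assumed texture resolution; pick whatever the look tolerates

def hash01(ix: int, iy: int) -> float:
    """Hypothetical integer hash mapped to [0, 1); stands in for the shader's hash()."""
    h = (ix * 374761393 + iy * 668265263) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 2**32

# Bake once, the way a noise texture would be generated offline or at load time.
NOISE_TEX = [[hash01(x, y) for x in range(SIZE)] for y in range(SIZE)]

def sample_noise(u: float, v: float) -> float:
    """Wrapping bilinear lookup at texel coordinates (u, v)."""
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0            # fractional position inside the texel cell
    x0, y0 = x0 % SIZE, y0 % SIZE      # wrap, like a repeating sampler
    x1, y1 = (x0 + 1) % SIZE, (y0 + 1) % SIZE
    top = NOISE_TEX[y0][x0] * (1 - fx) + NOISE_TEX[y0][x1] * fx
    bottom = NOISE_TEX[y1][x0] * (1 - fx) + NOISE_TEX[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

The trade is one filtered texture read per pixel in place of the hash arithmetic, so whether it wins depends on how texture-bound the shader already is.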
Chelovek Chelovek
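First ten: sin, cos, tan, pow, exp, log, smoothstep, step, mix, texture. Replace sin/cos with a low‑order polynomial over a reduced range or a small lookup table, but measure first: many GPUs already evaluate sin/cos in fast dedicated hardware, so a table read can cost more than the native call. tan is just sin/cos, so it can reuse the same approximations (at the cost of one reciprocal) or be avoided altogether. For pow, if the exponent is constant, use a pre‑computed table, or plain repeated multiplication for small integer exponents; otherwise rewrite it as exp(log(x)*y), which is valid for x > 0. Where exp and log appear back to back, collapse them into a single exp(log) chain if possible. Smoothstep/step/mix are cheap but still run every pixel; consider folding their arguments into a uniform if they’re static. Texture fetches are usually the biggest cost, so sample a lower‑resolution mip or a single channel if you don’t need all the data. After these swaps, run the profiler and cut any branches that never execute.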
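To make the polynomial option concrete, here's a Python sketch of one well-known cheap sine: a parabola fitted through the zeros and peaks, plus a single refinement step. This is a technique I'm swapping in for illustration, not something fixed by the plan above; `fast_sin`, `fast_cos` and `max_error` are hypothetical names, and it's written CPU-side only so the error can be checked before porting.

```python
import math

def fast_sin(x: float) -> float:
    """Parabolic sine approximation with one refinement step."""
    # Range-reduce to [-pi, pi).
    x = (x + math.pi) % (2.0 * math.pi) - math.pi
    # Parabola through sin's zeros (0, +-pi) and peaks (+-pi/2).
    y = (4.0 / math.pi) * x - (4.0 / math.pi**2) * x * abs(x)
    # Blend with the squared parabola to remove most of the remaining error.
    return 0.225 * (y * abs(y) - y) + y

def fast_cos(x: float) -> float:
    """cos(x) = sin(x + pi/2), so the same polynomial serves both."""
    return fast_sin(x + math.pi / 2.0)

def max_error(fn, ref, samples: int = 10_000) -> float:
    """Largest absolute difference between fn and ref on [-10, 10]."""
    points = (-10.0 + 20.0 * i / samples for i in range(samples + 1))
    return max(abs(fn(t) - ref(t)) for t in points)

if __name__ == "__main__":
    print("max sin error:", max_error(fast_sin, math.sin))
    print("max cos error:", max_error(fast_cos, math.cos))
```

The whole thing is a handful of multiplies, adds and an `abs`, with no branches, which is the property that matters when it moves into the shader.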
Tvoidrug Tvoidrug
Sounds good, let's sketch the lookup tables for sin/cos and drop the tan entirely. For pow, precompute a small table if the exponent stays fixed. Swap exp/log into a single exp(log) if we can, then tweak the texture binding to a single channel mip. After that, hit the profiler and cut every dead branch we find. Let's get started.
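For the fixed-exponent pow, a quick Python sketch of the table idea, next to plain repeated multiplication for comparison. `EXPONENT`, `ENTRIES`, `pow_lut` and `pow_mul` are all assumptions made up for this example, and the input is assumed to lie in [0, 1].

```python
EXPONENT = 5.0   # assumed fixed exponent, e.g. a specular-style power
ENTRIES = 256    # assumed table size

# Bake x**EXPONENT at evenly spaced x in [0, 1].
POW_TABLE = [(i / (ENTRIES - 1)) ** EXPONENT for i in range(ENTRIES)]

def pow_lut(x: float) -> float:
    """Approximate x**EXPONENT for x in [0, 1] by linear interpolation."""
    x = min(max(x, 0.0), 1.0)        # clamp to the baked range
    t = x * (ENTRIES - 1)
    i = min(int(t), ENTRIES - 2)     # keep i + 1 inside the table
    f = t - i
    return POW_TABLE[i] + (POW_TABLE[i + 1] - POW_TABLE[i]) * f

def pow_mul(x: float) -> float:
    """Same exponent by repeated multiplication: x^5 = (x^2)^2 * x.

    Hard-coded for EXPONENT == 5; only valid for small integer exponents.
    """
    x2 = x * x
    return x2 * x2 * x

if __name__ == "__main__":
    worst = max(abs(pow_lut(i / 1000) - (i / 1000) ** EXPONENT) for i in range(1001))
    print("max table error:", worst)
```

For a small integer exponent like this one the three multiplies are likely the better deal; the table earns its keep mainly when the exponent is fractional.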
Chelovek Chelovek
Alright, set up a small 256‑entry table for sin and cos, flip them on the fly, drop tan, bake the exponent table, collapse exp/log, reduce the texture to a single channel mip, then run the profiler and prune unused branches. Let's do it.
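Here's roughly what that 256-entry table could look like, sketched in Python so the error can be measured before anything touches the shader. One table serves both functions through a quarter-period phase shift; `sin_lut` and `cos_lut` are hypothetical names, and linear interpolation between entries is my assumption about how the table would be read.

```python
import math

TABLE_SIZE = 256  # the 256-entry size mentioned above
# One full period of sine, sampled at evenly spaced angles.
SIN_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def sin_lut(x: float) -> float:
    """Table-based sine with linear interpolation between entries."""
    t = x * TABLE_SIZE / (2.0 * math.pi)   # angle in table units
    i = math.floor(t)
    f = t - i                              # fraction between entry i and i + 1
    a = SIN_TABLE[i % TABLE_SIZE]          # modulo wraps negative angles too
    b = SIN_TABLE[(i + 1) % TABLE_SIZE]
    return a + (b - a) * f

def cos_lut(x: float) -> float:
    """cos(x) = sin(x + pi/2): reuse the sine table with a phase shift."""
    return sin_lut(x + 0.5 * math.pi)

if __name__ == "__main__":
    xs = [-10.0 + 20.0 * k / 10_000 for k in range(10_001)]
    print("max sin error:", max(abs(sin_lut(x) - math.sin(x)) for x in xs))
    print("max cos error:", max(abs(cos_lut(x) - math.cos(x)) for x in xs))
```

On the GPU the same layout maps onto a 256-texel 1D texture with repeat wrapping and linear filtering, which does the interpolation for free but still costs a fetch.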
Tvoidrug Tvoidrug
Alright, let’s spin up the tables, swap the sin/cos, drop tan, bake that exponent, collapse exp/log, shrink the texture to a single channel, then fire the profiler and prune what doesn’t move the needle. On it.