Hacker & BezierGirl
Hey, I was just tweaking a shader that draws Bezier curves at insane speed. Got any tricks to keep the curve smooth while staying in a tight frame rate?
Keep the number of segments per curve small and bounded, and adjust it on the fly: more segments when the camera is close, fewer when it is far. For static curves, pre‑evaluate on the CPU and upload the finished vertices; for animated ones, upload only the control points and let the vertex shader run De Casteljau’s algorithm with a fixed number of steps. Keep the shader as simple as possible: no divergent branching, no dynamic loops, just a few muls and adds. If you’re using tessellation, clamp the maximum tessellation level to what the frame budget allows. And remember, the simplest curve with the least wasted pixels usually runs the fastest.
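A minimal sketch of the fixed-step idea, written in Python for readability rather than as shader code. It assumes cubic curves with 2D control points; `segment_count` and its parameters (`min_segments`, `max_segments`, `reference_distance`) are hypothetical names for one possible distance-based rule, not something taken from a real engine.

```python
def lerp(a, b, t):
    """Linear interpolation between two 2D points."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def cubic_de_casteljau(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier at t with three fixed rounds of lerps.

    The rounds are written out by hand, so there is no loop and no branch,
    which is the shape a vertex shader would want.
    """
    q0, q1, q2 = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    r0, r1 = lerp(q0, q1, t), lerp(q1, q2, t)
    return lerp(r0, r1, t)


def segment_count(camera_distance, min_segments=4, max_segments=32,
                  reference_distance=10.0):
    """Hypothetical LOD rule: segments fall off with distance, then clamp."""
    if camera_distance <= 0.0:
        return max_segments
    wanted = int(max_segments * reference_distance / camera_distance)
    return max(min_segments, min(max_segments, wanted))


def tessellate(p0, p1, p2, p3, camera_distance):
    """Return the polyline vertices for one curve at the chosen detail level."""
    n = segment_count(camera_distance)
    return [cubic_de_casteljau(p0, p1, p2, p3, i / n) for i in range(n + 1)]
```

The clamp in `segment_count` is what keeps the per-curve cost bounded: however close the camera gets, no curve produces more than `max_segments + 1` vertices.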
Sounds solid—CPU pre‑eval for static parts keeps the GPU lean. Did you try hashing the control points into a texture to cut down on uniform bandwidth? Also, maybe use a small lookup table for the De Casteljau steps if you’re hitting the loop limit on older GPUs. Any pitfalls you ran into with the branching you mentioned?
Packing the control points into a texture works if you need to look up many curves per frame; just make sure the texture stays small (4 bytes per control point is fine). The lookup table for De Casteljau is a good idea, but only if you can unroll the loop; otherwise you’ll hit the loop limit on legacy cards. Branching is the usual culprit: an if whose condition differs between vertices in the same warp or wavefront forces the GPU to run both paths, so keep those branches to a minimum. If you must branch, make the condition uniform across the draw call rather than dependent on per‑vertex data, and use a fixed maximum for the number of iterations. Also, keep your texture fetches aligned and close together; scattered reads miss the texture cache and can stall the pipeline. In short, stay uniform‑heavy, loop‑light, and remember that divergence is a silent performance killer.
Nice points—aligned fetches always throw me off. Have you ever tried packing control points into a 2D array so you can fetch a whole row with one sample? It saves on samplers and keeps the bandwidth tidy. Also, I found that using a small constant pool of pre‑computed De Casteljau coefficients can cut a few cycles when you have to evaluate many curves per pixel. What about your approach to handling very flat curves?
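One way to picture the pre-computed coefficient pool, sketched in Python under a few assumptions: the curves are cubic, the sample positions `t` are evenly spaced and shared by every curve, and the stored coefficients are the cubic Bernstein weights (which give the same point De Casteljau would, just in closed form). The names `bernstein_table` and `evaluate_with_table` are made up for illustration.

```python
def bernstein_table(num_samples):
    """Precompute cubic Bernstein weights for evenly spaced t in [0, 1].

    num_samples must be at least 2 so that both endpoints are included.
    """
    table = []
    for i in range(num_samples):
        t = i / (num_samples - 1)
        u = 1.0 - t
        table.append((u * u * u, 3.0 * u * u * t, 3.0 * u * t * t, t * t * t))
    return table


def evaluate_with_table(control_points, table):
    """Evaluate one cubic curve at every tabulated t using only muls and adds."""
    p0, p1, p2, p3 = control_points
    return [
        (w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0],
         w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1])
        for (w0, w1, w2, w3) in table
    ]
```

The table is built once and reused for every curve, so the per-curve work drops to four multiply-adds per coordinate per sample. The trade-off is that the sample positions are fixed, which fits a constant segment count better than an adaptive one.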
When a curve is almost straight, the De Casteljau steps just waste work. In that case I check how far the interior control points sit from the line between the first and last control points; if that distance is below a tiny epsilon, I treat the curve as a line segment and skip the evaluation entirely. If you still want to run the algorithm, subdivide adaptively: start with a single midpoint and double the resolution only where the curvature exceeds a set threshold. That keeps the loops short for flat segments while preserving detail where it really matters.
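A rough Python sketch of a flatness check followed by adaptive subdivision. It is an illustration under stated assumptions, not the exact check described above: flatness here is the largest distance from either interior control point to the chord between the endpoints, a common choice because it also catches short but strongly bent curves. The `epsilon` and `max_depth` defaults are arbitrary placeholders.

```python
import math


def point_to_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (clamped to the endpoints)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_flat(p0, p1, p2, p3, epsilon):
    """True when both interior control points lie within epsilon of the chord."""
    return max(point_to_segment_distance(p1, p0, p3),
               point_to_segment_distance(p2, p0, p3)) <= epsilon


def midpoint(a, b):
    return ((a[0] + b[0]) * 0.5, (a[1] + b[1]) * 0.5)


def split(p0, p1, p2, p3):
    """Split a cubic at t = 0.5 with De Casteljau; returns the two halves."""
    q0, q1, q2 = midpoint(p0, p1), midpoint(p1, p2), midpoint(p2, p3)
    r0, r1 = midpoint(q0, q1), midpoint(q1, q2)
    s = midpoint(r0, r1)
    return (p0, q0, r0, s), (s, r1, q2, p3)


def flatten(p0, p1, p2, p3, epsilon=1e-3, max_depth=8):
    """Return polyline vertices, subdividing only where the curve is not flat."""
    if max_depth == 0 or is_flat(p0, p1, p2, p3, epsilon):
        return [p0, p3]
    left, right = split(p0, p1, p2, p3)
    # Drop the shared split point from the left half to avoid a duplicate.
    return (flatten(*left, epsilon, max_depth - 1)[:-1]
            + flatten(*right, epsilon, max_depth - 1))
```

`max_depth` plays the role of the fixed iteration cap mentioned earlier in the conversation: even a badly behaved curve cannot produce more than `2 ** max_depth` segments.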
That’s clever—just a quick distance check before the heavy lifting. I might even cache the epsilon results per frame to avoid recomputing for static meshes. How do you handle cases where the control points move so fast that the straight‑line check flips mid‑frame?