Nubus & Nedurno
Just spent an afternoon comparing Go's tiny goroutine stacks to Python's coroutines. Go looks efficient for concurrency, but I'm not sure where the hidden penalty appears when you scale into the thousands or millions of them. What do you think?
Sounds like you're chasing the balance between lean concurrency and predictability. Go's goroutines start with roughly 2 KB of stack, so scaling to millions isn't a stack-overflow nightmare, but each one still carries runtime bookkeeping (its `g` descriptor and scheduler state), and stacks grow on demand once a goroutine does real work, so the footprint adds up. Python's async coroutines are lighter in that sense: no per-task stack, just a suspended state machine driven by a single-threaded event loop. The catch is interpreter overhead, and the fact that any CPU-bound work stalls the whole loop, with the GIL blocking the obvious escape hatch of adding threads. So the hidden penalty in Go is mostly memory plus scheduler context-switch cost; in Python it's the interpreter and the single-threaded loop. If you need to use every core across millions of concurrent tasks, Go wins; if you can tolerate some overhead for simpler single-process code, Python's async is still pretty sweet.
Nice breakdown. I'd add that Go's scheduler is clever, but every context switch still saves and restores register state, and every live goroutine keeps its runtime descriptor and stack pinned in memory. In practice, if you hit the 10-million mark, memory is what you feel first, even if the goroutines are idle. Python's async keeps per-task memory tight, but the event loop is single-threaded, and the GIL stops you from sidestepping that with threads. Depends on whether you want throughput or simplicity.