Finger & Bitok
Bitok Bitok
Hey, have you seen that new concurrency bug in the async I/O lib? It’s a real brain‑teaser—mind if we dissect it together?
Finger Finger
I’ve skimmed the report—looks like a classic race on the reference counter. If you send me the trace, I can map the interleaving in seconds. Just let me know the exact call sequence.
Bitok Bitok
Sure thing, here’s the exact call chain that’s been triggering the race:

1. Thread A calls `incrementRefCount()` on object X.
2. Immediately after, Thread B calls `decrementRefCount()` on the same object X.
3. Thread A then calls `performAsyncOperation()`, which internally does `fetchData()`.
4. Thread B triggers `cleanup()`, which calls `decrementRefCount()` again.
5. The async operation’s callback from Thread A executes, attempting `decrementRefCount()` one last time.
6. Finally, a background watchdog thread checks `isObjectAlive()` on X.

That interleaving should expose the counter underflow. Let me know if that matches what you’re seeing.
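If it helps to see the numbers, here’s a rough single-threaded replay of that chain. It is only a sketch: `replayCallChain` and `initialCount` are names I made up, the starting count of X is not in the report, and I’m assuming `performAsyncOperation()` / `fetchData()` do not touch the counter themselves.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Replays the call chain on one thread, in the listed order, and records
// the counter after each step that changes it.
// ASSUMPTION: initialCount is hypothetical; the report does not state it.
std::vector<std::pair<std::string, int>> replayCallChain(int initialCount) {
    int count = initialCount;
    std::vector<std::pair<std::string, int>> trace;

    trace.emplace_back("A: incrementRefCount()", ++count);              // step 1
    trace.emplace_back("B: decrementRefCount()", --count);              // step 2
    // Step 3 (performAsyncOperation -> fetchData) is assumed not to
    // change the counter, so it adds no entry.
    trace.emplace_back("B: cleanup() -> decrementRefCount()", --count); // step 4
    trace.emplace_back("A callback: decrementRefCount()", --count);     // step 5
    return trace;
}
```

With `initialCount = 1` the last entry is the callback taking the count from 0 to -1, which is the value the watchdog would then read in step 6.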
Finger Finger
Sounds right. The race is between the two decrements and the async callback. The counter goes from 1 to 0, then gets decremented again, underflowing. Add a guard or switch to an atomic counter that throws on negative values. That’ll expose the fault before the watchdog hits it. Need a quick patch or a unit test to reproduce deterministically?
Bitok Bitok
Sure thing, here’s a quick guard you can drop in, plus a deterministic test harness.

**Patch snippet (add to the ref counter logic):**

```cpp
#include <atomic>
#include <iostream>
#include <stdexcept>

std::atomic<int> refCnt{0};

void incrementRefCount() {
    refCnt.fetch_add(1, std::memory_order_relaxed);
}

void decrementRefCount() {
    // fetch_sub returns the value held *before* the subtraction
    int old = refCnt.fetch_sub(1, std::memory_order_acq_rel);
    if (old <= 0) {
        // Defensive: the counter has just gone negative, so report it loudly
        std::cerr << "Ref counter underflow! Old=" << old << std::endl;
        throw std::runtime_error("Ref counter underflow");
    }
}
```

The atomic makes each increment and decrement indivisible, and the explicit check reports the extra decrement before the watchdog ever sees it. Note that the guard detects the underflow rather than preventing it: by the time it throws, the counter already holds the negative value.

**Deterministic unit test (using gtest or similar):**

```cpp
#include <atomic>
#include <future>
#include <stdexcept>
#include <thread>
#include <gtest/gtest.h>

TEST(RefCounter, UnderflowDetection) {
    std::promise<void> aDone;
    std::future<void> aDoneFuture = aDone.get_future();
    std::atomic<bool> underflowCaught{false};

    // Step 1: start with one reference
    incrementRefCount();

    // Thread A: releases the only reference
    std::thread a([&]() {
        decrementRefCount();  // count becomes 0
        aDone.set_value();
    });

    // Thread B: stands in for the async callback that decrements again
    std::thread b([&]() {
        aDoneFuture.wait();  // wait until A has decremented
        try {
            decrementRefCount();  // should trigger the underflow guard
        } catch (const std::runtime_error&) {
            // An exception escaping a std::thread calls std::terminate,
            // so it has to be caught inside the thread
            underflowCaught = true;
        }
    });

    // Wait for threads to finish
    a.join();
    b.join();
    EXPECT_TRUE(underflowCaught);
}
```

Because we use `std::promise`/`future` to enforce the exact order, the test always hits the underflow path. If the guard works, thread B catches the runtime error and the test passes; without the guard the flag stays false, the test fails, and you’re left with a silent negative count.

Let me know if you want the test wrapped in a CI job or an example of how to hook this into your existing watchdog. Also, if you’re curious: starting from a count of 1, the atomic `fetch_sub` guarantees that even if two threads call it simultaneously, one will see `old == 1` and the other will see `old == 0`, so the guard reliably triggers only once. Happy hunting!
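One thing the patch above leaves open is that the counter is already negative by the time the guard fires. If you would rather refuse the extra decrement outright, a compare-and-swap loop is one option. Treat this as a sketch: `GuardedRefCount` and `tryDecrement` are hypothetical names, not anything from the library, and it returns `false` instead of throwing.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical variant of the guard: never lets the count drop below zero.
class GuardedRefCount {
public:
    explicit GuardedRefCount(int initial = 0) : count_(initial) {}

    void increment() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true if a reference was released, false if the count was
    // already zero (the would-be underflow). The count is left unchanged
    // in the failure case.
    bool tryDecrement() {
        int current = count_.load(std::memory_order_relaxed);
        while (current > 0) {
            // On failure, compare_exchange_weak stores the freshly observed
            // value into `current`, so the loop condition re-checks it.
            if (count_.compare_exchange_weak(current, current - 1,
                                             std::memory_order_acq_rel,
                                             std::memory_order_relaxed)) {
                return true;
            }
        }
        return false;
    }

    int value() const { return count_.load(std::memory_order_acquire); }

private:
    std::atomic<int> count_;
};
```

The trade-off is that the caller now has to check the return value, but the watchdog can never observe a negative count.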
Finger Finger
Looks solid. The guard will catch the underflow before it slips through to the watchdog. The test forces the exact ordering, so the guard fires on every run. Good to drop this into CI; just make sure the exception handling in the production code logs or aborts appropriately. If you hit any hiccups with the promise sync or need a lighter-weight harness, let me know.
Bitok Bitok
Sounds like a plan. Just make sure the exception is caught at the top level so the process can either log and continue or abort cleanly. If you’d rather swap out the promise/future combo for something lighter, a simple `std::condition_variable` with a flag is enough to order the threads. And hey, if the guard ever starts throwing on a non-fatal path, I can help you roll a fallback that keeps the watchdog happy but still alerts you. Just ping me when you hit that snarl.
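For reference, the condition-variable ordering could look roughly like this. It is a self-contained sketch with its own local counter rather than the real ref-count functions, and `secondDecrementUnderflows` is just a name I picked for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true when the second decrement, forced to run after the first,
// observes the underflow (previous value <= 0).
bool secondDecrementUnderflows() {
    std::atomic<int> refCnt{1};   // one live reference to start with
    std::mutex m;
    std::condition_variable cv;
    bool firstDone = false;       // the ordering flag, guarded by m
    bool underflow = false;       // written by thread b, read after join

    std::thread a([&] {
        refCnt.fetch_sub(1, std::memory_order_acq_rel);  // 1 -> 0
        {
            std::lock_guard<std::mutex> lock(m);
            firstDone = true;
        }
        cv.notify_one();
    });

    std::thread b([&] {
        std::unique_lock<std::mutex> lock(m);
        // The predicate form handles spurious wakeups and the case where
        // thread a has already set the flag before b starts waiting.
        cv.wait(lock, [&] { return firstDone; });
        int old = refCnt.fetch_sub(1, std::memory_order_acq_rel);
        underflow = (old <= 0);
    });

    a.join();
    b.join();
    return underflow;
}
```

The flag matters as much as the condition variable: without it, a notification sent before thread b starts waiting would be lost and b would block forever.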
Finger Finger
Got it—will wrap the throw in a top‑level catch, log the fault, and decide whether to abort or keep going based on the severity flag. I’ll test the condition_variable version too; the promise is a bit heavy. If the guard starts surfacing on normal paths, I’ll bring in a non‑fatal flag and an alert callback. Keep me posted on the fallback design when you’re ready.
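Roughly what I have in mind for the top-level handling, as a sketch only: `Severity`, `Outcome`, and `runGuarded` are placeholder names, and the alert callback is whatever logging hook we end up with.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

enum class Severity { NonFatal, Fatal };            // hypothetical severity flag
enum class Outcome { Ok, Continued, ShouldAbort };  // what the caller should do next

// Runs `work` under a top-level catch. On a runtime_error it forwards the
// message to `alert`, then reports whether the caller should keep going
// or abort, based on the severity flag.
Outcome runGuarded(const std::function<void()>& work,
                   Severity severity,
                   const std::function<void(const std::string&)>& alert) {
    try {
        work();
        return Outcome::Ok;
    } catch (const std::runtime_error& e) {
        alert(e.what());
        return severity == Severity::Fatal ? Outcome::ShouldAbort
                                           : Outcome::Continued;
    }
}
```

Returning `ShouldAbort` instead of calling `std::abort()` inside the helper keeps it testable; the real entry point would act on the result.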