Mark & LayerCrafter
Stumbled on a race condition in our legacy API that only triggers on a specific request pattern. Think you’d be up for digging into it?
Sure, but I’m not a quick‑fix person. Send me the exact request sequence, the code that handles it, and your current locking scheme. Once we can step through the shared state access, we’ll apply the proper synchronization and re‑run the test. Keep the logs granular so we catch the exact moment the race slips in. If it starts emailing me about itself, we’ll know it solved itself.
Here’s the minimal path that reproduces it:
1. GET /api/status (first read)
2. POST /api/start (creates a job and stores a UUID)
3. GET /api/job/<uuid> (second read, expects the job to exist)
4. POST /api/finish/<uuid> (sets status to finished)
The handlers are in job.go:

    // handleStart creates a job, stores it under a fresh UUID, and echoes it back.
    func handleStart(w http.ResponseWriter, r *http.Request) {
        id := uuid.New().String()
        job := &Job{ID: id, State: "running"}
        mu.Lock()
        jobs[id] = job
        mu.Unlock()
        json.NewEncoder(w).Encode(job)
    }

    // handleStatus reads the jobs map while holding the same mutex.
    func handleStatus(w http.ResponseWriter, r *http.Request) {
        mu.Lock()
        defer mu.Unlock()
        // read jobs map
    }
The lock `mu` is a plain sync.Mutex. The race shows up when POST /api/start lands right after GET /api/status but before GET /api/job/<uuid>. If the job read runs before the entry is in the map, the lookup returns a nil pointer and the handler dereferences it. Keeping the whole map read inside the lock should close that window. I'd also switch to sync.RWMutex and wrap the status read in RLock to reduce contention. Keep your logs inside the lock and print a timestamp plus a goroutine ID (or a request ID, since Go has no official API for goroutine IDs) so you see exactly when the map is accessed. That should be enough to see the glitch and fix it.
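A rough sketch of the logging I have in mind, not the real job.go. The names `lookupTraced`, `trace`, and `reqID` are made up for illustration, and `reqID` is a per-request tag standing in for a goroutine ID. The log line is written before the read lock is released, so the timestamp matches the moment the map was accessed.

```go
package main

import (
	"log"
	"os"
	"sync"
)

// Job is a cut-down stand-in for the struct used in job.go.
type Job struct {
	ID    string
	State string
}

var (
	mu   sync.RWMutex
	jobs = make(map[string]*Job)

	// trace is a hypothetical logger; Lmicroseconds adds microsecond
	// resolution to the timestamp on every line.
	trace = log.New(os.Stderr, "trace ", log.LstdFlags|log.Lmicroseconds)
)

// lookupTraced reads the map under the read lock and logs while the lock
// is still held. reqID is an assumed per-request tag, not a goroutine ID.
func lookupTraced(reqID, id string) (*Job, bool) {
	mu.RLock()
	defer mu.RUnlock()
	job, ok := jobs[id]
	trace.Printf("req=%s lookup id=%s found=%t", reqID, id, ok)
	return job, ok
}
```

Running the test suite with `go test -race` alongside these logs should flag any unsynchronized access to the map that the timestamps alone would miss.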
Use an RWMutex: RLock for the status handler, Lock for writes, then guard the job read with a nil check. That eliminates the window where the read sees a nil entry. Add a small log after the map lookup to confirm the timing, and make sure the job ID is validated before writing to the response. That’s all you need to nail the race.
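Roughly what I mean, as a sketch under assumptions rather than a drop-in patch. The `jobStore` type, its methods, the JSON tags, and the `/api/job/` path parsing are all hypothetical. The idea is that writes take Lock, reads take RLock, and a missing entry becomes a 404 instead of a nil dereference.

```go
package main

import (
	"encoding/json"
	"net/http"
	"strings"
	"sync"
)

// Job mirrors the struct implied by job.go; the JSON tags are assumed.
type Job struct {
	ID    string `json:"id"`
	State string `json:"state"`
}

// jobStore is a hypothetical wrapper around the jobs map and its lock.
type jobStore struct {
	mu   sync.RWMutex
	jobs map[string]*Job
}

func newJobStore() *jobStore {
	return &jobStore{jobs: make(map[string]*Job)}
}

// add takes the write lock because it mutates the map.
func (s *jobStore) add(j *Job) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.jobs[j.ID] = j
}

// get takes the read lock and returns a copy of the job, so callers never
// touch the shared *Job after the lock is released. The bool is false when
// the ID is unknown or the stored pointer is nil.
func (s *jobStore) get(id string) (Job, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	j, found := s.jobs[id]
	if !found || j == nil {
		return Job{}, false
	}
	return *j, true
}

// handleJob serves GET /api/job/<uuid> and answers 404 rather than
// dereferencing a nil *Job when the entry is not in the map.
func (s *jobStore) handleJob(w http.ResponseWriter, r *http.Request) {
	id := strings.TrimPrefix(r.URL.Path, "/api/job/")
	job, ok := s.get(id)
	if !ok {
		http.Error(w, "job not found", http.StatusNotFound)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(job)
}
```

To wire it up you would register `store.handleJob` on the `/api/job/` prefix of your mux. Returning a copy from `get` is a design choice: it keeps a later finish handler from mutating a job while another request is still encoding it.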
Sounds good. I’ll swap the mutex, add the nil guard, and log a timestamp after each lookup. Let’s hit run.