mirror of https://github.com/hashicorp/consul
Deadlock scenario: 1. Due to scheduling, the state runner sends one snapshot into snapCh and then attempts to send a second. The first send succeeds because the channel is buffered, but the second blocks. 2. Separately, Manager.Watch is called by the xDS server after getting a discovery request from Envoy. This function acquires the manager lock and then blocks on receiving the CurrentSnapshot from the state runner. 3. Separately, there is a Manager goroutine that reads the snapshots from the channel in step 1. These reads are done to notify proxy watchers, but they require holding the manager lock. This goroutine goes to acquire that lock, but can't because it is held by step 2. Now, the goroutine from step 3 is waiting on the one from step 2 to release the lock. The goroutine from step 2 won't release the lock until the goroutine in step 1 advances. But the goroutine in step 1 is waiting for the one in step 3. Deadlock. By making this send non-blocking step 1 above can proceed. The coalesce timer will be reset and a new valid snapshot will be delivered after it elapses or when one is requested by xDS.pull/9689/head
parent
cf9a14ab6a
commit
37190c0d0d
Loading…
Reference in new issue