Skip to content

Conversation

@ispeters
Copy link
Contributor

There's a race condition in spawn_future() between dropping a future and the completion of the spawned operation. It's possible for the spawned operation to record the "state" of the future as complete, have the dropping future observe that, which leads to deleting the shared state, whereupon the completing operation touches a deleted object when trying to set the event.

This diff adds a unit test that repros the race (it crashed without the fix), and fixes the problem with a spin-wait.

I don't really like this solution because the spin-wait seems bad, but we're seeing production crashes so I need something. We can reconsider the overall synchronization design once we've got the crashes fixed. Maybe we need a refcount on top of the state machine.

There's a race condition in `spawn_future()` between dropping a `future`
and the completion of the spawn operation.  It's possible for the
spawned operation to record the "state" of the future as complete, have
the dropping future observe that, which leads to deleting the shared
state, whereupon the completing operation touches a deleted object when
trying to set the event.

This diff adds a unit test that repros the race (it crashed without the
fix), and fixes the problem with a spin-wait.

I don't *really* like this solution because the spin-wait seems bad, but
we're seeing production crashes so I need *something*.  We can
reconsider the overall synchronization design once we've got the crashes
fixed.  Maybe we need a refcount on top of the state machine.
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 19, 2023
@ispeters ispeters requested a review from janondrusek May 19, 2023 00:33
@janondrusek janondrusek merged commit 83fc64b into main May 19, 2023
@ispeters ispeters deleted the fix_spawn_future_race branch May 19, 2023 00:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants