Skip to content

Conversation

@serathius
Copy link
Member

@serathius serathius commented Jul 4, 2025

Fixes #20221

/cc @ahrtr @fuweid

The bug is in resync setting minRev. It assumes that watcher can only be behind, so after synchronizing it sets revision to current. However restore moves all watcher to unsynced, even those that are in the future. Those watchers don't get any events from resync, but their minRev is overriden.

@k8s-ci-robot k8s-ci-robot requested review from ahrtr and fuweid July 4, 2025 12:57
@serathius serathius force-pushed the watch-restore-future branch 2 times, most recently from 0536c4d to a3ac879 Compare July 4, 2025 13:02
@serathius serathius force-pushed the watch-restore-future branch from a3ac879 to cf0369e Compare July 4, 2025 13:09
@serathius serathius mentioned this pull request Jul 4, 2025
2 tasks
// Next retry of syncWatchers would try to resend the compacted watch response to w.ch
continue
}
w.minRev = curRev + 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only happens when watching a future revision?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add such protection for all places which update w.minRev.

func (w *watcher) updateMinRev(rev) {
    if rev > w.minRev {
        w.minRev = rev
   }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying this everywhere might hide other kinds of bugs. I would prefer to cover the code with tests and then rewrite it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@codecov
Copy link

codecov bot commented Jul 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.95%. Comparing base (866bc07) to head (cf0369e).
Report is 4 commits behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
server/storage/mvcc/watchable_store.go 93.73% <100.00%> (-0.34%) ⬇️

... and 45 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #20281      +/-   ##
==========================================
+ Coverage   68.96%   69.95%   +0.98%     
==========================================
  Files         415      399      -16     
  Lines       34595    34027     -568     
==========================================
- Hits        23860    23804      -56     
+ Misses       9340     8838     -502     
+ Partials     1395     1385      -10     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 866bc07...cf0369e. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for the great work!

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@serathius serathius merged commit 9fb8c36 into etcd-io:main Jul 4, 2025
34 checks passed
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Good catch!

@ivanvc
Copy link
Member

ivanvc commented Jul 6, 2025

/cherry-pick release-3.6

@k8s-infra-cherrypick-robot

@ivanvc: new pull request created: #20286

In response to this:

/cherry-pick release-3.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Copy link
Member Author

/cherry-pick release-3.5
/cherry-pick release-3.4

@k8s-infra-cherrypick-robot

@serathius: #20281 failed to apply on top of branch "release-3.4":

Applying: Avoid lowering revision of watchers in the future after restore
Using index info to reconstruct a base tree...
A	server/storage/mvcc/watchable_store.go
A	server/storage/mvcc/watchable_store_test.go
Falling back to patching base and 3-way merge...
Auto-merging mvcc/watchable_store_test.go
CONFLICT (content): Merge conflict in mvcc/watchable_store_test.go
Auto-merging mvcc/watchable_store.go
CONFLICT (content): Merge conflict in mvcc/watchable_store.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 Avoid lowering revision of watchers in the future after restore

In response to this:

/cherry-pick release-3.5
/cherry-pick release-3.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Copy link
Member Author

/cherry-pick release-3.5

@k8s-infra-cherrypick-robot

@serathius: #20281 failed to apply on top of branch "release-3.5":

Applying: Avoid lowering revision of watchers in the future after restore
Using index info to reconstruct a base tree...
A	server/storage/mvcc/watchable_store.go
A	server/storage/mvcc/watchable_store_test.go
Falling back to patching base and 3-way merge...
Auto-merging server/mvcc/watchable_store_test.go
CONFLICT (content): Merge conflict in server/mvcc/watchable_store_test.go
Auto-merging server/mvcc/watchable_store.go
CONFLICT (content): Merge conflict in server/mvcc/watchable_store.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 Avoid lowering revision of watchers in the future after restore

In response to this:

/cherry-pick release-3.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Copy link
Member Author

Manual backport #20290

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Watch on future revision might receive old events or notifications, happens when client connects to multiple members.

6 participants