-
Notifications
You must be signed in to change notification settings - Fork 6.6k
feat(controller): Permit using newer revision when retrying failed sync (#11494) #23038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
❌ Preview Environment deleted from BunnyshellAvailable commands (reply to this comment):
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #23038 +/- ##
==========================================
+ Coverage 60.32% 60.36% +0.03%
==========================================
Files 350 350
Lines 60032 60061 +29
==========================================
+ Hits 36217 36254 +37
+ Misses 20901 20895 -6
+ Partials 2914 2912 -2 ☔ View full report in Codecov by Sentry. |
|
@jannfis, I understand you had a PoC doing something similar, I would appreciate your review. Thanks! |
agaudreault
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in contributor meeting, only auto-sync should have this behaviour. #11494 (comment)
|
If a sync fails, is it possible to differentiate between a retryable error (eg: nodes, resources, api server not available) and a non retryable error (eg: a manifest is not according to the schema, image does not exist etc). And the sync operation will retry only for a retryable error and for the other scenario, sync would just fail as it does not make sense to keep retrying if the error is say manifest not matching the schema. |
Interesting point. I have few observations:
Agreed that manifests not matching the schema make no sense retrying with same commit. BUT it is exactly the thing I would like Argo CD to retry with new commit ASAP. |
|
On the contributors meeting it was agreed that updating the commit sha (as originally implemented here) can be confusing for users as it does not indicate the fact a sync operation with HEAD~1 has failed. There was a consensus it is better to find out a way to get the current retry fail clearly, and let/force the new one to kick in. |
b33513d to
eb10df5
Compare
|
Thanks for the pointers, @agaudreault. I have changed the approach to let the current attempt fail, and let the new one to kick in, but I am struggling to get things to work. The phase never gets to Succeeded in the very last step. Also, the UI seems to be in an odd state. The "SYNC STATUS" reports success on last (fixed) commit, but "LAST SYNC" reports fail on an earlier commit.
Hitting [Sync] in E2E UI gets the app go completely green, but for some reason, it does not happen on its own. |
0e6c345 to
3259682
Compare
c267687 to
e7a8caa
Compare
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
|
@agaudreault, I have updated the tests. Curious to hear your thoughts regarding the previous comment... |
These are the use cases / business requirements to test with this change. The UI sync behaviour should also populate the refresh option and correct sources. The rollback experience is different and should never try to refresh. The behaviour of the UI vs CLI for the same operation should be consistent with how retries are configured for each. To make sure the operation can be refreshed correctly, you must make sure that the sources are properly set in the operation object. |
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]>
2ba691b to
dd2e7ca
Compare
Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Oliver Gondža <[email protected]>
|
@agaudreault, I have added the tests (and prod code fix) for explicit CLI sync. Though, the (CLI) rollback seems to be a no-brainer: |
eb627ec to
8317692
Compare
Signed-off-by: Oliver Gondža <[email protected]>
…494-agaudreault Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Oliver Gondža <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
Signed-off-by: Alexandre Gaudreault <[email protected]>
…494-agaudreault Signed-off-by: Alexandre Gaudreault <[email protected]>
cover sync/rollback scenarios
agaudreault
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Thanks, @agaudreault. Your help was priceless! |
…nc (argoproj#11494) (argoproj#23038) Signed-off-by: Zadkiel AHARONIAN <[email protected]> Signed-off-by: Oliver Gondža <[email protected]> Signed-off-by: Alexandre Gaudreault <[email protected]> Co-authored-by: Zadkiel AHARONIAN <[email protected]> Co-authored-by: Alexandre Gaudreault <[email protected]>

When sync fails due to incorrect manifest declarations, this permits fixes to be deployed via future commits.
This is controlled by a new boolean Application CRD field
syncPolicy.retry.refreshor via the--sync-retry-refreshflag.Closes #11494
Related to #6055
Discussed at https://www.youtube.com/watch?v=baIX9Bk6f5w&t=1173s
Kudos to @aslafy-z and @Sayrus for doing most of the heavy lifting here (#15603).
Checklist: