-
Notifications
You must be signed in to change notification settings - Fork 479
Prevent admitting inactive workloads #7913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Skipping CI for Draft Pull Request. |
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
|
/test all |
| if err = h.r.queues.AddOrUpdateWorkload(wlCopy); err != nil { | ||
| log.V(2).Info("ignored an error for now", "error", err) | ||
| if workload.IsActive(wlCopy) && !workload.HasQuotaReservation(wlCopy) { | ||
| if err = h.r.queues.AddOrUpdateWorkload(wlCopy); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see AddOrUpdateWorkload calls AddOrUpdateWorkloadWithoutLock under the hood and you added the checks to that function as well.
Is it purely "just in case"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you highlighted is the actual check.
What I added deeper in AddOrUpdateWorkloadWithoutLock is an assertion.
The story is that there are 6 paths to reach AddOrUpdateWorkloadWithoutLock altogether, and the other 5 were already covered with this check. (BTW not easily migratable down the call stack). So I added the 6-th check on the same level, to be "little invasive" into the code structure.
However, I also chose to add an assertion deeper down, so that tests can catch any change which adds a 7-th path but forgets to pre-check.
(Discussed in more detail in the issue, #6711).
|
/retest |
|
Thanks 👍 |
|
@mimowo: once the present PR merges, I will cherry-pick it on top of In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
LGTM label has been added. Git tree hash: 643fc0ce1864fe42115770cb12246fb93f29182e
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mimowo, olekzabl The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@mimowo: new pull request created: #7918 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@mimowo: #7913 failed to apply on top of branch "release-0.13": In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
…7939) * Prevent admitting inactive workloads * Adress lint finding & review comment * Yet another linter fix * Manually fix imports
…7944) * Prevent admitting inactive workloads * Adress lint finding & review comment * Yet another linter fix * Manually fix imports * Remove an empty line
* Initital implementation. * added assumed workloads verification to tests * [Cleanup] Restrict controller-manager access of ClusterProfiles to the Kueue namespace (kubernetes-sigs#7843) * Restrict access to ClusterProfiles to those existing in kueue's namespace * Use fake client instead of stub * Log when skipping ClusterProfile * feat: display flavor assignment attempts in events (kubernetes-sigs#7646) * Prevent admitting inactive workloads (kubernetes-sigs#7913) * Prevent admitting inactive workloads * Adress lint finding & review comment * Yet another linter fix * Add integration tests for excludeResourcePrefixes scheduler configuration (kubernetes-sigs#7492) This adds comprehensive integration tests for the excludeResourcePrefixes feature in a dedicated test directory following the pattern of other scheduler tests (fairsharing, podsready). The tests verify: - Basic resource exclusion from quota calculations - Multiple excluded resource prefixes - Quota enforcement for non-excluded resources - Exact prefix matching (not substring matching) All tests create Workload objects directly and use the scheduler test utilities for robust assertions. * Add performance tests for v1beta2 TopologyAssignment encoding (kubernetes-sigs#7821) * Add performance tests for v1beta2 TAS assignment * Various fixes * Address linter findings * Add some logging to debug Prow timeout * Round node limits to thousands (to speed up the test) * Remove logging; speed up the test * Change node pool counts * TAS: Respect `requiredDuringSchedulingIgnoredDuringExecution` affinity (kubernetes-sigs#7899) * Respect `requiredDuringSchedulingIgnoredDuringExecution` in TAS * Remove redundant information from `leafDomain` * Reduce the amount of `nil` checks when handling affinity in TAS * Adjust comment describing node selector check * Add unit test for TAS required affinity * Make label keys less ambiguous in test * Reword the TAS affinity check test cases * Add wait for ClusterQueue Active status in visibility test (kubernetes-sigs#7922) * Unauthorized error when running renovate (kubernetes-sigs#7926) * Fix the availability check after rolling restart (kubernetes-sigs#7925) * test: fix pod groups E2E test flake by checking workload admission before gates (kubernetes-sigs#7917) Signed-off-by: Sohan Kunkerkar <[email protected]> * Use main context for CertBootstrap (kubernetes-sigs#7930) * Replace kueue-populator references for release (kubernetes-sigs#7932) * add replaces to prepare-release-branch target * prepare READMEs * make DES_CHART_DIR overridable * cleanup * cleanup * cleanup * add makefile comment * replace registry in readmes * Revert "replace registry in readmes" This reverts commit 0b066cc. * Fix the flaky test for resources (kubernetes-sigs#7941) * Cleanup. * Applied review comments. * Applied review comments 2. * Restore log. * Using wrong qKey source in heads fix. * Fixed pernding workloads reporting bug. * cq test TestFIFOClusterQueue cleanup. * Appwraper test cleanup. --------- Signed-off-by: Sohan Kunkerkar <[email protected]> Co-authored-by: Michał Szadkowski <[email protected]> Co-authored-by: Mykyta Derhunov <[email protected]> Co-authored-by: Olek Zabłocki <[email protected]> Co-authored-by: Kevin Hannon <[email protected]> Co-authored-by: Karol Szuster <[email protected]> Co-authored-by: Irving Mondragón <[email protected]> Co-authored-by: vladikkuzn <[email protected]> Co-authored-by: Michał Woźniak <[email protected]> Co-authored-by: Sohan Kunkerkar <[email protected]> Co-authored-by: Jakub Skiba <[email protected]>
What type of PR is this?
/kind bug
What this PR does / why we need it:
It fixes #6711 - analogously as in the discontinued #6737 - but adding 2 more bits:
Which issue(s) this PR fixes:
Fixes #6711
Special notes for your reviewer:
Please see the discussion in #6711, starting from this comment.
Does this PR introduce a user-facing change?