[Fix][flaky test] Kueue visibility server when There are pending workloads due to capacity maxed by the admitted job Should allow fetching information about pending workloads in ClusterQueue (v1beta1) #7922
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
/cc @mbobrovskyi @mimowo
/lgtm
@mimowo: once the present PR merges, I will cherry-pick it on top of
LGTM label has been added. Git tree hash: 376b0fe9eaae9d38829ba9c287476374eb1d6868
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: IrvingMg, mimowo. The full list of commands accepted by this bot can be found here. The pull request process is described here.
This may not cherry-pick well since it's specific to the v1beta1 API.
@mimowo: new pull request created: #7923
@mimowo: new pull request created: #7924
* Initial implementation.
* added assumed workloads verification to tests
* [Cleanup] Restrict controller-manager access of ClusterProfiles to the Kueue namespace (kubernetes-sigs#7843)
* Restrict access to ClusterProfiles to those existing in kueue's namespace
* Use fake client instead of stub
* Log when skipping ClusterProfile
* feat: display flavor assignment attempts in events (kubernetes-sigs#7646)
* Prevent admitting inactive workloads (kubernetes-sigs#7913)
* Prevent admitting inactive workloads
* Address lint finding & review comment
* Yet another linter fix
* Add integration tests for excludeResourcePrefixes scheduler configuration (kubernetes-sigs#7492). This adds comprehensive integration tests for the excludeResourcePrefixes feature in a dedicated test directory following the pattern of other scheduler tests (fairsharing, podsready). The tests verify (see the prefix-matching sketch after this list):
  - Basic resource exclusion from quota calculations
  - Multiple excluded resource prefixes
  - Quota enforcement for non-excluded resources
  - Exact prefix matching (not substring matching)

  All tests create Workload objects directly and use the scheduler test utilities for robust assertions.
* Add performance tests for v1beta2 TopologyAssignment encoding (kubernetes-sigs#7821)
* Add performance tests for v1beta2 TAS assignment
* Various fixes
* Address linter findings
* Add some logging to debug Prow timeout
* Round node limits to thousands (to speed up the test)
* Remove logging; speed up the test
* Change node pool counts
* TAS: Respect `requiredDuringSchedulingIgnoredDuringExecution` affinity (kubernetes-sigs#7899)
* Respect `requiredDuringSchedulingIgnoredDuringExecution` in TAS
* Remove redundant information from `leafDomain`
* Reduce the amount of `nil` checks when handling affinity in TAS
* Adjust comment describing node selector check
* Add unit test for TAS required affinity
* Make label keys less ambiguous in test
* Reword the TAS affinity check test cases
* Add wait for ClusterQueue Active status in visibility test (kubernetes-sigs#7922)
* Unauthorized error when running renovate (kubernetes-sigs#7926)
* Fix the availability check after rolling restart (kubernetes-sigs#7925)
* test: fix pod groups E2E test flake by checking workload admission before gates (kubernetes-sigs#7917) Signed-off-by: Sohan Kunkerkar <[email protected]>
* Use main context for CertBootstrap (kubernetes-sigs#7930)
* Replace kueue-populator references for release (kubernetes-sigs#7932)
* add replaces to prepare-release-branch target
* prepare READMEs
* make DES_CHART_DIR overridable
* cleanup
* cleanup
* cleanup
* add makefile comment
* replace registry in readmes
* Revert "replace registry in readmes". This reverts commit 0b066cc.
* Fix the flaky test for resources (kubernetes-sigs#7941)
* Cleanup.
* Applied review comments.
* Applied review comments 2.
* Restore log.
* Using wrong qKey source in heads fix.
* Fixed pending workloads reporting bug.
* cq test TestFIFOClusterQueue cleanup.
* AppWrapper test cleanup.
---------

Signed-off-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Michał Szadkowski <[email protected]>
Co-authored-by: Mykyta Derhunov <[email protected]>
Co-authored-by: Olek Zabłocki <[email protected]>
Co-authored-by: Kevin Hannon <[email protected]>
Co-authored-by: Karol Szuster <[email protected]>
Co-authored-by: Irving Mondragón <[email protected]>
Co-authored-by: vladikkuzn <[email protected]>
Co-authored-by: Michał Woźniak <[email protected]>
Co-authored-by: Sohan Kunkerkar <[email protected]>
Co-authored-by: Jakub Skiba <[email protected]>
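The excludeResourcePrefixes entry above stresses that exclusion uses exact prefix matching, not substring matching. A minimal Go sketch of that rule, assuming an illustrative helper (`isExcluded` and the sample prefixes are hypothetical, not Kueue's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// isExcluded reports whether a resource name starts with any excluded
// prefix. HasPrefix (not Contains) is the point the tests pin down.
func isExcluded(resourceName string, excludeResourcePrefixes []string) bool {
	for _, p := range excludeResourcePrefixes {
		if strings.HasPrefix(resourceName, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := []string{"example.com/"}
	fmt.Println(isExcluded("example.com/gpu", prefixes))    // true: name starts with the prefix
	fmt.Println(isExcluded("my-example.com/gpu", prefixes)) // false: substring match is not enough
}
```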
What type of PR is this?
/kind bug
What this PR does / why we need it:
I wasn't able to reproduce the issue locally, even after running over 300 repetitions. However, compared to the other visibility tests, this test is missing a wait for the ClusterQueue to become active, which would explain the error message.
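Concretely, the missing step can be sketched as a Gomega wait on the ClusterQueue's Active condition before the test submits any jobs. This is a minimal illustration assuming a controller-runtime client; the helper name, the timeout values, and the literal "Active" condition string are assumptions for the sketch, not the exact helpers Kueue's test utilities provide:

```go
package visibility_test

import (
	"context"
	"time"

	"github.com/onsi/gomega"
	"k8s.io/apimachinery/pkg/api/meta"
	"sigs.k8s.io/controller-runtime/pkg/client"

	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

// waitForClusterQueueActive polls the (cluster-scoped) ClusterQueue until
// its Active condition is True. The helper name and intervals are
// illustrative, not Kueue's own test utilities.
func waitForClusterQueueActive(ctx context.Context, g gomega.Gomega, c client.Client, name string) {
	g.Eventually(func(g gomega.Gomega) {
		var cq kueue.ClusterQueue
		g.Expect(c.Get(ctx, client.ObjectKey{Name: name}, &cq)).To(gomega.Succeed())
		// "Active" is the condition the ClusterQueue controller sets once
		// the queue's flavors resolve and its quota becomes usable.
		g.Expect(meta.IsStatusConditionTrue(cq.Status.Conditions, "Active")).To(gomega.BeTrue())
	}, 30*time.Second, time.Second).Should(gomega.Succeed())
}
```

Running this right after creating the ClusterQueue ensures the later pending-workloads query observes workloads that are pending because capacity is maxed out by the admitted job, not because the queue itself is still inactive.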
Which issue(s) this PR fixes:
Fixes #7909
Special notes for your reviewer:
Does this PR introduce a user-facing change?