TopologyAssignment v1beta2 #7544

olekzabl · 2025-11-05T15:46:37Z

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

This PR:

changes the format of TopologyAssignment in v1beta2, as proposed in Support larger workloads in TAS #7220 and planned in KEP KEP changes for v1beta2 TopologyAssignment #7419
implements a simple "topology assignment encoder", which:
- always uses just 1 slice
- uses universalPodCount & universalValue whenever possible
- when not using universalValue, always extracts the longest possible prefix & suffix.

The main motivation for this change is to support more nodes for a single workload.
For some real-life node naming schemes which we considered:

the current limit (without this change) is ~25k nodes
the new limit (with this change) is ~60k nodes
this change also leaves a path for future enhancements, of two kinds:
- by changing the underlying encoding strategy (but not the format), so that multiple Slices are used, we might grow beyond 100k
- by exporting some Slices to separate CRD objects (a larger move, but based on this PR), we can practically get rid of any limit.

Which issue(s) this PR fixes:

Fixes #7220

Special notes for your reviewer:

Currently WIP. A summary of what's already done vs. postponed:
- API types
  - Comments on those
  - Kubebuilder (cross-)validation rules
- Conversion between old & new format
- Existing code builds
- Existing unit tests migrated to pass (minimal effort so far)
- Existing integration tests migrated to pass (minimal effort so far)
- Existing E2E tests migrated to pass (minimal effort so far)
- Test coverage for tas_assignment.go
  (99.1%, the only line not tested is this one - which is not testable but probably still good to keep)
- Test coverage for the conversion webhook
- Comments for the new implementation code (minimalistic, where non-trivial)
- Release note
And some optional points (I think they could be postponed to follow-up PRs, or dropped altogether - feedback is welcome):
- Ensuring that the existing tests are adjusted "nicely" (see # 4 below)
  - Current approach: A is good enough for this PR, C may be done as follow-up.
    I will open an issue for this once I hear voices preferring sth else than A.
- Tests for validation rules (likely integration tests)
  - Current approach: We want it but not in this PR.
    Tracked in Add validation tests for v1beta2 TopologyAssignment #7639.
- "Performance" tests, tracked in Add performance tests for v1beta2 TopologyAssignment encoding #7640:
  - Tests for limits of how many nodes can fit in 1 etcd entry, assuming most popular node naming schemes
  - Benchmarks of "encoding" performance for those naming schemes
The encoding always extracts the longest possible prefix & suffix.
We once considered an alternative to extract only up to a punctuation character (- or .).
For now, I chose against this, for the following reasons:
- This would degrade "compression efficiency" e.g. for AKS node naming scheme: {nodePoolId}-000023 etc.
  For a single node pool, we would keep the 0000 outside of the prefix, for no good reason.
- In my feeling, leaving 0000 outside of the prefix could also confuse the users.
- Last, the code is slightly simpler without this.
My general viewpoint is that the old format is better for reasoning about. The new format is only needed "at the surface" of Kueue, to be used in what we send to etcd (and then, obviously, what we read from it), to ensure scalability there.
Consequently, I stick to the old format in internal Kueue code wherever I could.
(To avoid depending on v1beta1, I introduced new internal type, tas.TopologyAssignment).
This had the benefit of reducing the size of this refactor.
However, my main motivation is permanent, so I intend this tas.TopologyAssignment to stay permanently.
As for how the tests were migrated, this was a quick attempt which we may choose to improve.
My main tactic so far was to replace
```
Expect(X).Should(BeComparableTo(kueue.TopologyAssignment{ ... }))
```
with
```
Expect(X).Should(BeComparableTo(tas.V1Beta2From(tas.TopologyAssignment{ ... })))
```
i.e. switch old to internal format, then just convert it to the new one (on the "expected" side).

This may be considered ugly, in 2 ways: (i) it makes all tests depend on the conversion logic, and (ii) it asserts a very specific encoding, one of many possible variants, of a TopologyAssignment.

I see 3 general approaches to this, and I'll be curious on feedback here
(where, besides direction feedback, we could also discuss if changes should happen in this PR or a follow-up one):

A. Keep this as is, it's good enough.

B. Replace the calls to tas.V1Beta2From() with their results, so that we explicitly say what we expect.
I don't like it, as it would make tests depend on the implementation details of internal -> v1beta2 conversion.

C. Switch to converting on the other side:
```
Expect(tas.InternalFrom(X)).Should(BeComparableTo(tas.TopologyAssignment{ ... }))
```
Here, the benefit is that we abstract from potential future changes of our encoding strategy.
We just assess what assignment is encoded, no matter how exactly.

Does this PR introduce a user-facing change?

v1beta2: The internal representation of TopologyAssignment (in WorkloadStatus) has been reorganized to allow using TAS for larger workloads. (More specifically, under the assumptions described in issue #7220, it allows to increase the maximal workload size from approx. 20k to approx. 60k nodes).

k8s-ci-robot · 2025-11-05T15:46:48Z

Hi @olekzabl. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

netlify · 2025-11-05T15:46:51Z

✅ Deploy Preview for kubernetes-sigs-kueue canceled.

Name	Link
🔨 Latest commit	`46b74af`
🔍 Latest deploy log	https://app.netlify.com/projects/kubernetes-sigs-kueue/deploys/69164580f0ffdc00088554ba

olekzabl · 2025-11-05T15:48:30Z

@mimowo @mwielgus @PBundyra FYI
(I guess you're busy these days, so I'm willing to wait - but any early feedback will be appreciated)

mimowo · 2025-11-05T16:02:49Z

/ok-to-test

mimowo · 2025-11-05T16:06:11Z

apis/kueue/v1beta1/workload_conversion.go

 	return autoConvert_v1beta1_WorkloadStatus_To_v1beta2_WorkloadStatus(in, out, s)
 }
+
+func Convert_v1beta1_TopologyAssignment_To_v1beta2_TopologyAssignment(in *TopologyAssignment, out *v1beta2.TopologyAssignment, s conversionapi.Scope) error {


Since the conversion is non-trivial it would be great to add a unit test in workload_conversion_test.go

To make diff initial you could draft some base tests as part of #7394, which we could extend in this PR.

Done. Specifically, I added:

a richer set of unit test cases in a dedicated file, tas_assigment_test.go

a smaller set of test cases in workload_conversion_test.go

the new conversion tests lgtm

olekzabl · 2025-11-06T00:11:14Z

/retest

olekzabl · 2025-11-06T00:30:01Z

/retest

olekzabl · 2025-11-06T09:03:13Z

/retest

olekzabl · 2025-11-06T09:18:42Z

/retest

olekzabl · 2025-11-06T09:34:51Z

/test

k8s-ci-robot · 2025-11-06T09:35:08Z

@olekzabl: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

/test pull-kueue-build-image-main

/test pull-kueue-test-e2e-certmanager-main

/test pull-kueue-test-e2e-customconfigs-main

/test pull-kueue-test-e2e-kueueviz-main

/test pull-kueue-test-e2e-main-1-31

/test pull-kueue-test-e2e-main-1-32

/test pull-kueue-test-e2e-main-1-33

/test pull-kueue-test-e2e-main-1-34

/test pull-kueue-test-e2e-multikueue-main

/test pull-kueue-test-e2e-tas-main

/test pull-kueue-test-integration-baseline-main

/test pull-kueue-test-integration-extended-main

/test pull-kueue-test-integration-multikueue-main

/test pull-kueue-test-scheduling-perf-main

/test pull-kueue-test-unit-main

/test pull-kueue-verify-main

Use /test all to run all jobs.

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

olekzabl · 2025-11-06T09:38:52Z

/retest

olekzabl · 2025-11-06T10:16:51Z

/retest

olekzabl · 2025-11-06T12:16:02Z

/retests

olekzabl · 2025-11-13T14:48:54Z

/retest

olekzabl · 2025-11-13T15:06:33Z

/retest

mimowo

LGTM, mostly nits, thank you for making the diff so easy to read thanks to the helper functions

One hesitation I have is still about handling the punctuation: "-" and ".". If we don't use punctuation now, then using it in the future to optimize performance may be breaking for some clusters on prem which don't use punctuation for node naming. OTOH, I'm not sure it makes sense to prematurely complicate the code - we will be able to evolve the code using feature gates anyway.

pkg/controller/tas/node_failure_controller.go

pkg/util/tas/tas_assignment.go

pkg/util/tas/tas_assignment_test.go

pkg/util/tas/tas_assignment.go

pkg/util/tas/tas_assignment_test.go

olekzabl · 2025-11-13T21:05:08Z

One hesitation I have is still about handling the punctuation: "-" and ".".

As mentioned in the description of this PR, I prefer not to deal with punctuation at this point - for 2 reasons:

as you mentioned, for not complicating the code

but also

for not discriminating against naming schemes like X-00000Y.

Then, to address your specific point:

If we don't use punctuation now, then using it in the future to optimize performance may be breaking for some clusters on prem which don't use punctuation for node naming.

Once we enter the level of sophistication where punctuation would be used to aid performance - we can also think of some heuristic like "drop sticking to punctuation for those inputs where it's very rare". WDYT?

mimowo · 2025-11-14T08:54:10Z

Once we enter the level of sophistication where punctuation would be used to aid performance - we can also think of some heuristic like "drop sticking to punctuation for those inputs where it's very rare". WDYT?

Maybe, or we just use the new algorithm if there are any punctuations, and they are at consistent positions for all node names.

I think if we, in the future, rollout a new algorithm we will need to use a feature-gate anyway. Then, we can enable that new feature gate as default, and basically wait for 2-3 releases to see if we get feedback from users. These users will be able to disable the feature gate if needed. If we don't hear feedback we will be free to roll the new algorithm.

mimowo · 2025-11-14T08:59:17Z

Thank you 👍
/lgtm
/approve

k8s-ci-robot · 2025-11-14T08:59:25Z

LGTM label has been added.

Git tree hash: fc9998269a0307f6383f42742513ebfae64a227b

kshalot · 2025-11-14T08:57:50Z

apis/kueue/v1beta2/workload_types.go

+	// It must be either nil pointer or a non-empty string.
+	// +optional
+	// +kubebuilder:validation:MaxLength=63


Nit:

Wording.

Suggested change

// It must be either nil pointer or a non-empty string.

// +optional

// +kubebuilder:validation:MaxLength=63

// It must be either a nil pointer or a non-empty string.

// +optional

// +kubebuilder:validation:MaxLength=63

Also I suppose the wording could be changed to sth like:

Suggested change

// It must be either nil pointer or a non-empty string.

// +optional

// +kubebuilder:validation:MaxLength=63

// If the values do not share a common prefix, it must be nil.

// +optional

// +kubebuilder:validation:MaxLength=63

You can also say ...it must be nil rather than an empty string.

If the string has to be non-empty, can we add a MinLength=1 validation? Or does it somehow not play nicely with the nil value?

kshalot · 2025-11-14T09:22:01Z

pkg/util/tas/tas_assignment.go

+) {
+	var prefix, suffix string
+	var maxLen, minLen, count int
+	for s := range inputProvider() {


Is it worth to add an early exit if both the prefix and suffix become empty inside this loop? In infrastructures using a naming scheme where the nodes don't usually share a prefix/suffix (whatever they may be) it would save some iterations.

kshalot · 2025-11-14T09:24:10Z

pkg/util/tas/tas_assignment.go

+		if count == 1 {
+			prefix = s
+			suffix = s
+			maxLen = len(s)
+			minLen = len(s)


You can consider moving the initialization outside of the loop by converting the iterator to a "pull operator". But this would change the code in other ways as well.

k8s-ci-robot · 2025-11-14T09:25:56Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kshalot, mimowo, olekzabl

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [mimowo]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* Edited v1beta2 types * (empty commit by mistake) * TAS assignment v1beta2 * Ran 'make manifest'; some fixes * Fix integration tests * Another fix of integration tests * Updated the API comment on v1beta2 TopologyAssignment * Fixes after sync * Adding generated files; Prow tests seem to fail without them * Omit empty slices in JSON * Regenerating a file * Fix comment typo * Fixed duplicated import * Add licensing header in tas_assignment.go * Formatting fixes * Fix imports arrangement * More fixes * Regenerated files * Ran `make update-helm` * Use ptr.Deref * Adding tas_assignment_test.go; fixing a bug :) * Reorganized unit tests - testing 3 functions with same test cases * Complete unit tests for tas_assignment.go * Fix test * Fix after sync * Fix after sync * Fix imports * Set MinLength=1 for prefix & suffix (following KEP discussion) * Apply data format changes suggested in KEP PR * Added unit tests for conversion webhook * Add comments on fuctions (where non-trivial) * Adjust after sync * Fix wrong JSON field name spec * Regenerated files * Another test update * Please a linter * Regenerated files * Fixes in magic comments * More fixes in magic comments * Addressing comments

k8s-ci-robot requested review from kannon92 and mbobrovskyi November 5, 2025 15:46

k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 5, 2025

k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 5, 2025

olekzabl changed the title ~~TopologyAssignment v1beta2~~ WIP: TopologyAssignment v1beta2 Nov 5, 2025

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 5, 2025

mimowo mentioned this pull request Nov 5, 2025

v1beta2: introduce unit tests for the conversions #7394

Closed

mimowo reviewed Nov 5, 2025

View reviewed changes

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 5, 2025

olekzabl force-pushed the br18 branch from 1953f29 to 1821c64 Compare November 6, 2025 00:00

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2025

olekzabl added 3 commits November 13, 2025 13:21

Please a linter

438b09e

Regenerated files

b0cb878

Fixes in magic comments

8c69cb8

olekzabl force-pushed the br18 branch from f4c7dac to 8c69cb8 Compare November 13, 2025 14:48

More fixes in magic comments

75179cd

olekzabl marked this pull request as ready for review November 13, 2025 15:33

k8s-ci-robot requested a review from mimowo November 13, 2025 15:33

olekzabl requested a review from kshalot November 13, 2025 15:33

olekzabl changed the title ~~WIP: TopologyAssignment v1beta2~~ TopologyAssignment v1beta2 Nov 13, 2025

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 13, 2025

mimowo reviewed Nov 13, 2025

View reviewed changes

Addressing comments

46b74af

olekzabl requested a review from mimowo November 13, 2025 21:07

k8s-ci-robot assigned mimowo Nov 14, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2025

k8s-ci-robot merged commit 7126635 into kubernetes-sigs:main Nov 14, 2025
22 checks passed

kshalot approved these changes Nov 14, 2025

View reviewed changes

k8s-ci-robot added this to the v0.15 milestone Nov 14, 2025

olekzabl mentioned this pull request Nov 17, 2025

Rename Common{Prefix,Suffix} to {Prefix,Suffix} in TAS assignment v1beta2 #7697

Merged

This was referenced Nov 20, 2025

Kubebuilder cross-validation tests for TopologyAssignment v1beta2 #7794

Merged

Add performance tests for v1beta2 TopologyAssignment encoding #7821

Merged

TopologyAssignment v1beta2 #7544

TopologyAssignment v1beta2 #7544

Uh oh!

Conversation

olekzabl commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Uh oh!

k8s-ci-robot commented Nov 5, 2025

Uh oh!

netlify bot commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for kubernetes-sigs-kueue canceled.

Uh oh!

olekzabl commented Nov 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mimowo commented Nov 5, 2025

Uh oh!

mimowo Nov 5, 2025

Choose a reason for hiding this comment

Uh oh!

olekzabl Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

mimowo Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

k8s-ci-robot commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 6, 2025

Uh oh!

olekzabl commented Nov 13, 2025

Uh oh!

olekzabl commented Nov 13, 2025

Uh oh!

mimowo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

olekzabl commented Nov 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mimowo commented Nov 14, 2025

Uh oh!

mimowo commented Nov 14, 2025

Uh oh!

k8s-ci-robot commented Nov 14, 2025

Uh oh!

Uh oh!

kshalot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

kshalot Nov 14, 2025

Choose a reason for hiding this comment

olekzabl commented Nov 5, 2025 •

edited

Loading

netlify bot commented Nov 5, 2025 •

edited

Loading

olekzabl commented Nov 5, 2025 •

edited

Loading

olekzabl commented Nov 13, 2025 •

edited

Loading