Higher-priority workloads are blocked while a “sticky” low-priority workload is pinned to the head of the queue #7301

@ichekrygin

Related issues and PRs:

  • Root cause discussion: #6929
  • Sticky workload introduction: #7157

Summary

After the “sticky pending workload” fix introduced in #7157, higher-priority workloads may become blocked behind a lower-priority workload that is pinned to the head of the queue.

The intent of the sticky behavior was to prevent starvation by keeping the workload that initiated preemption at the queue head until its preemption sequence completes. However, this also prevents newly added, higher-priority workloads from being scheduled while the sticky workload is still waiting for preemption to finish.

Setup

The setup is similar to #6929.

ClusterQueues:

  • cq1:
    • nominal CPU = 3
    • borrowingLimit = 0
    • preemption:

        preemption:
          reclaimWithinCohort: Any
          withinClusterQueue: LowerPriority

  • cq2:
    • nominal CPU = 0
    • borrowingLimit = 3

LocalQueues:

  • lq1 and lq2, corresponding to cq1 and cq2

PriorityClasses:

  • high = 10000
  • low = 1000

Jobs:

Job   CPU   Priority    ClusterQueue
j01   2     (default)   cq2
j02   4     low         cq1
j03   2     low         cq1
j04   2     high        cq1
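
The CPU arithmetic behind this setup can be checked with a short sketch. It assumes cq1 and cq2 share one cohort (implied by reclaimWithinCohort and by cq2 borrowing) and tracks CPU only. The helper `fits` and the constants are hypothetical names for illustration, not Kueue code, and the per-queue check is simplified to a ceiling that ignores the queue's own current usage.

```python
# Toy CPU accounting for the setup above; an illustration, not Kueue's logic.
COHORT_CPU = 3 + 0   # total nominal quota in the assumed cohort: cq1 = 3, cq2 = 0
CQ1_MAX = 3 + 0      # cq1 ceiling: nominal + borrowingLimit
CQ2_MAX = 0 + 3      # cq2 ceiling: nominal + borrowingLimit


def fits(request, queue_max, cohort_used):
    """True if the request is within the queue ceiling and the cohort's free CPU."""
    return request <= queue_max and request <= COHORT_CPU - cohort_used


used = 0
j01_admitted = fits(2, CQ2_MAX, used)   # cq2 borrows 2 CPU from cq1's nominal quota
used += 2                               # j01 is now running

j02_fits = fits(4, CQ1_MAX, used)       # 4 CPU exceeds cq1's ceiling of 3
j03_fits = fits(2, CQ1_MAX, used)       # only 1 CPU is free while j01 runs
j03_fits_after_reclaim = fits(2, CQ1_MAX, used - 2)  # free once j01 is preempted

print(j01_admitted, j02_fits, j03_fits, j03_fits_after_reclaim)
```

Under these assumptions j02 cannot fit even with preemption, while j03 and j04 each fit only after j01's borrowed CPU is reclaimed.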

Repro Steps

  1. Submit j01; it is admitted
  2. Submit j02; it stays pending (its request exceeds the available CPU)
  3. Submit j03, then j04, in that order

Observed behavior:

  • j03 enters the scheduling cycle first and triggers preemption of j01
  • Because of the sticky logic, j03 stays pinned at the head of the queue until that preemption completes
  • Meanwhile, j04 (higher priority) cannot be scheduled
  • After several scheduling cycles, j03 is eventually admitted
  • On the next cycle, j04 is scheduled and preempts j03
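
The blocking described above can be reproduced with a toy model of head selection. This is a minimal sketch and not Kueue's implementation: the `Workload` class, its `sticky` flag, and `pick_head` are hypothetical names, and j02 is left out because it never fits.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int
    created: int           # submission order
    sticky: bool = False   # hypothetical flag: set once the workload has issued preemptions


def pick_head(pending):
    """Model of the reported behavior: a sticky workload outranks everything,
    then higher priority wins, then the older workload."""
    return min(pending, key=lambda w: (not w.sticky, -w.priority, w.created))


j03 = Workload("j03", priority=1000, created=3)
j04 = Workload("j04", priority=10000, created=4)

# Cycle 1: only j03 is pending; it is picked and starts preempting j01.
first = pick_head([j03])
first.sticky = True

# Cycle 2: j04 has arrived, but the sticky pin still selects j03.
second = pick_head([j03, j04])
print(first.name, second.name)  # j03 j03
```

In this model the pin is evaluated before priority, which is the ordering the report objects to.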

Expected behavior

When a higher-priority workload (j04) enters the queue, it should preempt or take precedence over lower-priority workloads, even if one of them is currently “sticky.” The sticky behavior should not override priority-based scheduling order.
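
One possible reading of this expectation is that priority is compared first and stickiness only breaks ties among workloads of equal priority. The sketch below illustrates that ordering under that assumption. `Pending` and `pick_head_priority_first` are hypothetical names, and this is not presented as the fix chosen by the maintainers.

```python
from typing import NamedTuple


class Pending(NamedTuple):
    name: str
    priority: int
    created: int   # submission order
    sticky: bool   # hypothetical flag: workload has a preemption in flight


def pick_head_priority_first(pending):
    """Higher priority wins; stickiness only breaks ties at equal priority;
    the older workload wins last."""
    return min(pending, key=lambda w: (-w.priority, not w.sticky, w.created))


# The scenario from this issue: sticky low-priority j03 vs. newly added high-priority j04.
queue = [Pending("j03", 1000, 3, True), Pending("j04", 10000, 4, False)]
head = pick_head_priority_first(queue)

# At equal priority, the sticky workload still keeps its place ahead of an older peer.
peers = [Pending("a", 1000, 1, False), Pending("b", 1000, 2, True)]
peer_head = pick_head_priority_first(peers)

print(head.name, peer_head.name)  # j04 b
```

Keeping stickiness as a tie-breaker preserves the anti-starvation intent of #7157 among equal-priority workloads while letting j04 go first.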

Impact

This can lead to:

  • Unintended scheduling delays for higher-priority workloads
  • Inversion of expected priority ordering
  • Reduced responsiveness of the scheduler under preemption-heavy workloads
  • Preemption churn from out-of-order admission: in the example, j03 blocks j04 and is admitted first, only to be preempted by j04 afterward

Labels: kind/bug
