-
Notifications
You must be signed in to change notification settings - Fork 664
add samples for RayCluster token auth #4200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| apiVersion: ray.io/v1 | ||
| kind: RayCluster | ||
| metadata: | ||
| name: ray-cluster-with-manual-auth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docs we will show how to enable this for KubeRay versions < 1.5.1, hence the manual version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, appreciated it
andrewsykim
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Future-Outlier PTAL
3e286cb to
7d8ea5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the file ray-operator/config/samples/ray-cluster.auth.yaml
do you mind change the yaml file to the below?
for 2 reasons:
- currently
image: rayproject/ray:nightlydoesn't include auth token's commit, butimage: rayproject/ray:nightly-py311-cpudoes - lesser resources in the
requestfield, this is more easier for most users to run it
apiVersion: ray.io/v1
kind: RayCluster
metadata:
name: ray-cluster-with-auth
spec:
enableInTreeAutoscaling: true
authOptions:
mode: token
rayVersion: '2.52.0'
headGroupSpec:
rayStartParams: {}
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:nightly-py311-cpu
resources:
limits:
memory: 8G
requests:
cpu: 2
memory: 4G
ports:
- containerPort: 6379
name: gcs-server
- containerPort: 8265
name: dashboard
- containerPort: 10001
name: client
workerGroupSpecs:
- replicas: 1
minReplicas: 1
maxReplicas: 5
groupName: workergroup
rayStartParams: {}
template:
spec:
containers:
- name: ray-worker
image: rayproject/ray:nightly-py311-cpu
resources:
limits:
memory: 8G
requests:
cpu: 2
memory: 4GThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for manual-auth, I prefer this example for several reasons:
- lesser resources
- a worked image
rayproject/ray:nightly-py311-cpu - comments to help user create their own secret, and reference the secret
- add
authOptionsin the spec
apiVersion: ray.io/v1
kind: RayCluster
metadata:
name: ray-cluster-with-manual-auth
spec:
enableInTreeAutoscaling: true
authOptions:
mode: token
rayVersion: '2.52.0'
headGroupSpec:
rayStartParams: {}
template:
spec:
containers:
- name: ray-head
image: rayproject/ray:nightly-py311-cpu
env:
- name: RAY_AUTH_MODE
value: token
- name: RAY_AUTH_TOKEN
# You can create the secret manually with the following command:
# kubectl create secret generic ray-cluster-with-manual-auth --from-literal=auth_token='raycluster_secret' -n default
# And then use the following valueFrom to reference the secret:
valueFrom:
secretKeyRef:
key: auth_token # change this to match your Secret key
name: ray-cluster-with-manual-auth # change this to match your Secret name
resources:
limits:
memory: 8G
requests:
cpu: 2
memory: 4G
ports:
- containerPort: 6379
name: gcs-server
- containerPort: 8265
name: dashboard
- containerPort: 10001
name: client
workerGroupSpecs:
- replicas: 1
minReplicas: 1
maxReplicas: 5
groupName: workergroup
rayStartParams: {}
template:
spec:
containers:
- name: ray-worker
image: rayproject/ray:nightly-py311-cpu
env:
- name: RAY_AUTH_MODE
value: token
- name: RAY_AUTH_TOKEN
valueFrom:
secretKeyRef:
key: auth_token
name: ray-cluster-with-manual-auth # change this to match your Secret name
resources:
limits:
memory: 8G
requests:
cpu: 2
memory: 4G
2025-11-18.17-16-00.mp4
|
I noticed |
Signed-off-by: Andrew Sy Kim <[email protected]>
7d8ea5f to
054e572
Compare
* [Bug] Sidecar mode shouldn't restart head pod when head pod is deleted (#4141) * [Bug] Sidecar mode shouldn't restart head pod when head pod is deleted Signed-off-by: 400Ping <[email protected]> * [Fix] Fix e2e error Signed-off-by: 400Ping <[email protected]> * [Fix] fix according to rueian's comment Signed-off-by: 400Ping <[email protected]> * [Chore] fix ci error Signed-off-by: 400Ping <[email protected]> * Update ray-operator/controllers/ray/raycluster_controller.go Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> Signed-off-by: Ping <[email protected]> * Update ray-operator/controllers/ray/rayjob_controller.go Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> Signed-off-by: Ping <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * Trigger CI Signed-off-by: Future-Outlier <[email protected]> --------- Signed-off-by: 400Ping <[email protected]> Signed-off-by: Ping <[email protected]> Signed-off-by: Future-Outlier <[email protected]> Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> * fix: dashboard build for kuberay 1.5.0 (#4161) Signed-off-by: Future-Outlier <[email protected]> * [Feature Enhancement] Set ordered replica index label to support multi-slice (#4163) * [Feature Enhancement] Set ordered replica index label to support multi-slice Signed-off-by: Ryan O'Leary <[email protected]> * rename replica-id -> replica-name Signed-off-by: Ryan O'Leary <[email protected]> * Separate replica index feature gate logic Signed-off-by: Ryan O'Leary <[email protected]> * remove index arg in createWorkerPod Signed-off-by: Ryan O'Leary <[email protected]> --------- Signed-off-by: Ryan O'Leary <[email protected]> * update stale feature gate comments (#4174) Signed-off-by: Andrew Sy Kim <[email protected]> * [RayCluster] Add more context why we don't recreate head Pod for RayJob (#4175) Signed-off-by: Kai-Hsun Chen <[email protected]> * feature: Remove empty resource list initialization. (#4168) Fixes #4142. * [Dockerfile] [KubeRay Dashboard]: Fix Dockerfile warnings (ENV format, CMD JSON args) (#4167) * [#4166] improvement: Fix Dockerfile warnings (ENV format, CMD JSON args) * extract the hostname from CMD Signed-off-by: Neo Chien <[email protected]> --------- Signed-off-by: Neo Chien <[email protected]> Co-authored-by: cchung100m <[email protected]> * [Fix] Resolve int32 overflow by having the calculation in int64 and c… (#4158) * [Fix] Resolve int32 overflow by having the calculation in int64 and cap it if the count is over math.MaxInt32 Signed-off-by: justinyeh1995 <[email protected]> * [Test] Add unit tests for CalculateReadyReplicas Signed-off-by: justinyeh1995 <[email protected]> * [Fix] Add a nosec comment to pass the Lint (pre-commit) test Signed-off-by: justinyeh1995 <[email protected]> * [Refactor] Add CapInt64ToInt32 to replace #nosec directives Signed-off-by: justinyeh1995 <[email protected]> * [Refactor] Rename function to SafeInt64ToInt32 and add a underflowing prevention (it also help pass the lint test) Signed-off-by: justinyeh1995 <[email protected]> * [Refactor] Remove the early return as SafeInt64ToInt32 handles the int32 overflow and underflow checking. Signed-off-by: justinyeh1995 <[email protected]> --------- Signed-off-by: justinyeh1995 <[email protected]> * Add RayService incremental upgrade sample for guide (#4164) Signed-off-by: Ryan O'Leary <[email protected]> * Edit RayCluster example config for label selectors (#4151) Signed-off-by: Ryan O'Leary <[email protected]> * [RayJob] update light weight submitter image from quay.io (#4181) Signed-off-by: Future-Outlier <[email protected]> * [flaky] RayJob fails when head Pod is deleted when job is running (#4182) Signed-off-by: Future-Outlier <[email protected]> * [CI] Pin Docker api version to avoid API version mismatch (#4188) Signed-off-by: win5923 <[email protected]> * Make replicas configurable for kuberay-operator #4180 (#4195) * Make replicas configurable for kuberay-operator #4180 * Make replicas configurable for kuberay-operator #4180 * [Fix] rayjob update raycluster status (#4192) * feat: check if raycluster status update in rayjob * test: e2e test to check the rayjob raycluster status update * fix: dashboard http client tests discovered and passing (#4173) Signed-off-by: alimaazamat <[email protected]> * [RayJob] Lift cluster status while initializing (#4191) Signed-off-by: Spencer Peterson <[email protected]> * [RayJob] Remove updateJobStatus call (#4198) Fast follow to #4191 Signed-off-by: Spencer Peterson <[email protected]> * Add support for Ray token auth (#4179) * Add support for Ray token auth Signed-off-by: Andrew Sy Kim <[email protected]> * add e2e test for Ray cluster auth Signed-off-by: Andrew Sy Kim <[email protected]> * address nits from Ruiean Signed-off-by: Andrew Sy Kim <[email protected]> * update RAY_auth_mode -> RAY_AUTH_MODE Signed-off-by: Andrew Sy Kim <[email protected]> * configure auth for Ray autoscaler Signed-off-by: Andrew Sy Kim <[email protected]> --------- Signed-off-by: Andrew Sy Kim <[email protected]> * Bump js-yaml from 4.1.0 to 4.1.1 in /dashboard (#4194) Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1. - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@4.1.0...4.1.1) --- updated-dependencies: - dependency-name: js-yaml dependency-version: 4.1.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * update minimum Ray version required for token authentication to 2.52.0 (#4201) * update minimum Ray version required for token authentication to 2.52.0 Signed-off-by: Andrew Sy Kim <[email protected]> * update RayCluster auth e2e test to use Ray v2.52 Signed-off-by: Andrew Sy Kim <[email protected]> --------- Signed-off-by: Andrew Sy Kim <[email protected]> * add samples for RayCluster token auth (#4200) Signed-off-by: Andrew Sy Kim <[email protected]> * update (#4208) Signed-off-by: Future-Outlier <[email protected]> * [RayJob] Add token authentication support for All mode (#4210) * dashboard client authentication support Signed-off-by: Future-Outlier <[email protected]> * support rayjob Signed-off-by: Future-Outlier <[email protected]> * update to fix api serverr err Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * updarte Signed-off-by: Future-Outlier <[email protected]> * Rayjob sidecar mode auth token mode support Signed-off-by: Future-Outlier <[email protected]> * RayJob support k8s job mode Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * Address Andrew's advice Signed-off-by: Future-Outlier <[email protected]> * add todo x-ray-authorization comments Signed-off-by: Future-Outlier <[email protected]> --------- Signed-off-by: Future-Outlier <[email protected]> * [RayCluster] Enable Secret informer watch/list and remove unused RBAC verbs (#4202) * Add authentication secret reconciliation support Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * update Signed-off-by: Future-Outlier <[email protected]> * fix flaky test Signed-off-by: Future-Outlier <[email protected]> * remove test fix Signed-off-by: Rueian <[email protected]> --------- Signed-off-by: Future-Outlier <[email protected]> Signed-off-by: Rueian <[email protected]> Co-authored-by: Rueian <[email protected]> * [APIServer][Docs] Add user guide for retry behavior & configuration (#4144) * [Docs] Add the draft description about feature intro, configurations, and usecases Signed-off-by: justinyeh1995 <[email protected]> * [Fix] Update the retry walk-through Signed-off-by: justinyeh1995 <[email protected]> * [Doc] rewrite the first 2 sections Signed-off-by: justinyeh1995 <[email protected]> * [Doc] Revise documentation wording and add Observing Retry Behavior section Signed-off-by: justinyeh1995 <[email protected]> * [Fix] fix linting issue by running pre-commit run berfore commiting Signed-off-by: justinyeh1995 <[email protected]> * [Fix] fix linting errors in the Markdown linting Signed-off-by: justinyeh1995 <[email protected]> * [Fix] Clean up the math equation Signed-off-by: justinyeh1995 <[email protected]> * Update the math formula of Backoff calculation. Co-authored-by: Nary Yeh <[email protected]> Signed-off-by: JustinYeh <[email protected]> * [Fix] Explicitly mentioned exponential backoff and removed the customization parts Signed-off-by: justinyeh1995 <[email protected]> * [Docs] Clarify naming by replacing “APIServer” with “KubeRay APIServer” Co-authored-by: Cheng-Yeh Chung <[email protected]> Signed-off-by: JustinYeh <[email protected]> * [Docs] Rename retry-configuration.md to retry-behavior.md for accuracy Signed-off-by: justinyeh1995 <[email protected]> * Update Title to KubeRay APIServer Retry Behavior Co-authored-by: Cheng-Yeh Chung <[email protected]> Signed-off-by: JustinYeh <[email protected]> * [Docs] Add a note about the limitation of retry configuration Signed-off-by: justinyeh1995 <[email protected]> --------- Signed-off-by: justinyeh1995 <[email protected]> Signed-off-by: JustinYeh <[email protected]> Co-authored-by: Nary Yeh <[email protected]> Co-authored-by: Cheng-Yeh Chung <[email protected]> * Support X-Ray-Authorization fallback header for accepting auth token via proxy (#4213) * Support X-Ray-Authorization fallback header for accepting auth token in dashboard Signed-off-by: Future-Outlier <[email protected]> * remove todo comment Signed-off-by: Future-Outlier <[email protected]> --------- Signed-off-by: Future-Outlier <[email protected]> * [RayCluster] make auth token secret name consistency (#4216) Signed-off-by: fscnick <[email protected]> * [RayCluster] Status includes head containter status message (#4196) * [RayCluster] Status includes head containter status message Signed-off-by: Spencer Peterson <[email protected]> * lint Signed-off-by: Spencer Peterson <[email protected]> * [RayCluster] Containers not ready status reflects structured reason Signed-off-by: Spencer Peterson <[email protected]> * nit Signed-off-by: Spencer Peterson <[email protected]> --------- Signed-off-by: Spencer Peterson <[email protected]> * Remove erroneous call in applyServeTargetCapacity (#4212) Signed-off-by: Ryan O'Leary <[email protected]> * [RayJob] Add token authentication support for light weight job submitter (#4215) * [RayJob] light weight job submitter auth token support Signed-off-by: Future-Outlier <[email protected]> * X-Ray-Authorization Signed-off-by: Rueian <[email protected]> --------- Signed-off-by: Future-Outlier <[email protected]> Signed-off-by: Rueian <[email protected]> Co-authored-by: Rueian <[email protected]> * feat: kubectl ray get token command (#4218) * feat: kubectl ray get token command Signed-off-by: Rueian <[email protected]> * Update kubectl-plugin/pkg/cmd/get/get_token_test.go Co-authored-by: Copilot <[email protected]> Signed-off-by: Rueian <[email protected]> * Update kubectl-plugin/pkg/cmd/get/get_token.go Co-authored-by: Copilot <[email protected]> Signed-off-by: Rueian <[email protected]> * make sure the raycluster exists before getting the secret Signed-off-by: Rueian <[email protected]> * better ux Signed-off-by: Rueian <[email protected]> * Update kubectl-plugin/pkg/cmd/get/get_token.go Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> Signed-off-by: Rueian <[email protected]> --------- Signed-off-by: Rueian <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> --------- Signed-off-by: 400Ping <[email protected]> Signed-off-by: Ping <[email protected]> Signed-off-by: Future-Outlier <[email protected]> Signed-off-by: Ryan O'Leary <[email protected]> Signed-off-by: Andrew Sy Kim <[email protected]> Signed-off-by: Kai-Hsun Chen <[email protected]> Signed-off-by: Neo Chien <[email protected]> Signed-off-by: justinyeh1995 <[email protected]> Signed-off-by: win5923 <[email protected]> Signed-off-by: alimaazamat <[email protected]> Signed-off-by: Spencer Peterson <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Rueian <[email protected]> Signed-off-by: JustinYeh <[email protected]> Signed-off-by: fscnick <[email protected]> Co-authored-by: Ping <[email protected]> Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]> Co-authored-by: Ryan O'Leary <[email protected]> Co-authored-by: Kai-Hsun Chen <[email protected]> Co-authored-by: Kavish <[email protected]> Co-authored-by: Neo Chien <[email protected]> Co-authored-by: cchung100m <[email protected]> Co-authored-by: JustinYeh <[email protected]> Co-authored-by: Jun-Hao Wan <[email protected]> Co-authored-by: Divyam Raj <[email protected]> Co-authored-by: Nary Yeh <[email protected]> Co-authored-by: Alima Azamat <[email protected]> Co-authored-by: Spencer Peterson <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Rueian <[email protected]> Co-authored-by: Cheng-Yeh Chung <[email protected]> Co-authored-by: fscnick <[email protected]> Co-authored-by: Copilot <[email protected]>
Why are these changes needed?
Add token auth examples that will be referenced in docs
Related issue number
#4203
Checks