website/docs/installation/modal/index.md
30 lines changed: 0 additions & 30 deletions
@@ -130,36 +130,6 @@ def app():

 Once we deploy this model with `modal serve app.py`, it will output the URL of the web endpoint, in the form of `https://<USERNAME>--tabby-server-starcoder-1b-app-dev.modal.run`.

-To test if the server is working, you can send a POST request to the web endpoint.
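
For reference, the kind of POST request this removed line describes could be prepared roughly as follows. This is only a sketch: `modal_dev_endpoint` and `build_post_request` are hypothetical helper names, and the `/v1/completions` path is an assumption borrowed from the SkyPilot page in this same diff, not something the Modal page confirms here. The request is built but not sent.

```python
import json
import urllib.request


def modal_dev_endpoint(username: str) -> str:
    """Fill in the URL template printed by `modal serve app.py` (hypothetical helper)."""
    return f"https://{username}--tabby-server-starcoder-1b-app-dev.modal.run"


def build_post_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request against the web endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Assumption: the completion API lives at /v1/completions on this endpoint.
request = build_post_request(
    modal_dev_endpoint("alice"),
    "/v1/completions",
    {"language": "python", "segments": {"prefix": "def fib(n):\n"}},
)
```

Passing `request` to `urllib.request.urlopen` would actually send it; that step is left out so the sketch stays free of network access.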
website/docs/installation/skypilot/index.md
24 lines changed: 3 additions & 21 deletions
@@ -21,11 +21,11 @@ resources:

 SkyPilot supports GPUs from various cloud vendors. Please refer to the official [SkyPilot documentation](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html) for detailed installation instructions.

-As Tabby exposes its health check at `/v1/health`, we can define the following service configuration:
+Tabby exposes its health check at the `/metrics` endpoint, which also serves as a Prometheus metrics endpoint. Therefore, we can define the following readiness probe:

 ```yaml
 service:
-  readiness_probe: /v1/health
+  readiness_probe: /metrics
   replicas: 1
 ```

@@ -52,7 +52,7 @@ This finishes launching SkyServe's control VM which runs a load balancer for thi
 When you execute the following command, you'll encounter a message indicating that the replica is not ready:

 ```bash
-$ curl -L 'http://44.203.34.65:30001/v1/health'
+$ curl -L 'http://44.203.34.65:30001/metrics'

 {"detail":"No available replicas. Use \"sky serve status [SERVICE_NAME]\" to check the replica status."}%
 ```
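
The "not ready" message above can also be detected programmatically while waiting for a replica. The following is a minimal sketch, not part of the tutorial: `no_replicas_available` and `wait_for_replica` are hypothetical names, `fetch` is a caller-supplied function that returns the response body of the probe URL, and treating any other body (for example, metrics text) as "ready" is an assumption.

```python
import json
import time
from typing import Callable


def no_replicas_available(body: str) -> bool:
    """Return True if the body is the load balancer's 'No available replicas' message."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON (for example, plain-text metrics), so not the error message.
        return False
    detail = payload.get("detail", "") if isinstance(payload, dict) else ""
    return isinstance(detail, str) and detail.startswith("No available replicas")


def wait_for_replica(fetch: Callable[[], str], attempts: int = 30, delay: float = 10.0) -> bool:
    """Call fetch() until the 'No available replicas' message disappears.

    Returns True once a different body is seen, False if every attempt
    still reports that no replica is available.
    """
    for attempt in range(attempts):
        if not no_replicas_available(fetch()):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Injecting `fetch` keeps the polling logic independent of any particular HTTP client; a real caller would wrap a GET request to the load balancer URL and decide how to handle HTTP errors.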
@@ -68,22 +68,4 @@ Once the service is ready, you will see something like the following:



-SkyServe uses a redirect load balancer at its front, so the `-L` flag is necessary if you would like to test the completion API with `curl`.
-
-```bash
-$ curl -L -X 'POST' \
-  'http://44.203.34.65:30001/v1/completions' \
-  -H 'accept: application/json' \
-  -H 'Content-Type: application/json' \
-  -d '{
-  "language": "python",
-  "segments": {
-    "prefix": "def fib(n):\n ",
-    "suffix": "\n return fib(n - 1) + fib(n - 2)"
-  }
-}'
-
-{"id":"cmpl-ba9aae81-ed9c-419b-9616-fceb92cdbe79","choices":[{"index":0,"text":" if n <= 1:\n return n"}]}
-```
-
 Now, you can utilize the load balancer URL (`http://44.203.34.65:30001` in this case) within Tabby editor extensions. Please refer to [`tabby.yaml`](https://github.com/TabbyML/tabby/blob/main/website/docs/installation/skypilot/tabby.yaml) for the full configuration used in this tutorial.
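
For readers who still want to exercise the completion API by hand, the request body and response shown in the `curl` example of this hunk can be handled along these lines. This is a sketch under assumptions: `completion_payload` and `first_completion_text` are hypothetical helper names, and the field names are taken only from that single example, not from a full API reference.

```python
import json


def completion_payload(language: str, prefix: str, suffix: str) -> dict:
    """Build a request body shaped like the one in the curl example."""
    return {"language": language, "segments": {"prefix": prefix, "suffix": suffix}}


def first_completion_text(response_body: str) -> str:
    """Extract the text of the first choice from a completion response."""
    choices = json.loads(response_body).get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"]


payload = completion_payload(
    "python",
    "def fib(n):\n ",
    "\n return fib(n - 1) + fib(n - 2)",
)
body = json.dumps(payload)  # what would be sent as the POST body
```

Note that `curl -L -X POST` repeats the POST after a redirect, whereas Python's standard `urllib` does not re-send a POST body on every redirect status, so a client used against the redirecting load balancer may need explicit redirect handling.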