docs/README.md: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ OSS Connector for AI/ML contains some high-performance Python libraries specific
Currently, the OSS connector is composed of two libraries: OSS Model Connector and OSS Torch Connector.
- -[OSS Torch Connector](https://aliyun.github.io/oss-connector-for-ai-ml/#/torchconnector/introduction) is dedicated to AI training scenarios, including loading [datasets](https://pytorch.org/docs/stable/data.html#dataset-types) from OSS and loading/saving checkpoints from/to OSS.
+ -[OSS Torch Connector](https://aliyun.github.io/oss-connector-for-ai-ml/#/torchconnector/introduction) is dedicated to AI training scenarios, including loading [datasets](https://pytorch.org/docs/stable/data.html#dataset-types) from OSS and loading/saving checkpoints or [Distributed Checkpoints (DCP)](https://docs.pytorch.org/docs/stable/distributed.checkpoint.html) from/to OSS.
-[OSS Model Connector](https://aliyun.github.io/oss-connector-for-ai-ml/#/modelconnector/introduction) focuses on AI inference scenarios, loading large model files from OSS into local AI inference frameworks.
docs/torchconnector/examples.md: 41 additions & 1 deletion
@@ -273,4 +273,44 @@ with checkpoint.writer(CHECKPOINT_WRITE_URI) as writer:
    torch.save(state_dict, writer)
```
- OssCheckpoint can be used for checkpoints, and also for high-speed uploading and downloading of arbitrary objects. In our testing environment, the download speed can exceed 15GB/s.
+ OssCheckpoint can be used for checkpoints, and also for high-speed uploading and downloading of arbitrary objects. In our testing environment, the download speed can exceed 15GB/s.
+
+ ## Distributed checkpoints
+
+ OSS connector for AI/ML supports [PyTorch distributed checkpoints (DCP)](https://docs.pytorch.org/docs/stable/distributed.checkpoint.html) since v1.2.0rc2.
To use checkpoint-related features within Docker, the container must be run with `--privileged`, because the connector relies on userfaultfd to accelerate checkpoint reads.
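As a rough aid when diagnosing checkpoint reads inside a container, the sketch below reads the standard Linux sysctl `vm.unprivileged_userfaultfd`. The helper name `unprivileged_userfaultfd` is hypothetical and not part of the connector. This only reports a kernel setting; it is an assumption that the value is useful as a hint here, and a container's security profile may still block userfaultfd regardless of what it reports.

```python
from pathlib import Path
from typing import Optional

# Standard Linux sysctl file; it is absent on older kernels and non-Linux hosts.
SYSCTL_PATH = Path("/proc/sys/vm/unprivileged_userfaultfd")


def unprivileged_userfaultfd(path: Path = SYSCTL_PATH) -> Optional[bool]:
    """Return True if the sysctl reads "1", False for any other value,
    or None if the file cannot be read.

    Hypothetical helper for illustration only; not a connector API.
    """
    try:
        value = path.read_text().strip()
    except OSError:
        # File missing or unreadable: the setting is unknown.
        return None
    return value == "1"


if __name__ == "__main__":
    state = unprivileged_userfaultfd()
    if state is None:
        print("vm.unprivileged_userfaultfd: not available on this host")
    else:
        print(f"vm.unprivileged_userfaultfd enabled: {state}")
```

A `False` or `None` result does not by itself mean checkpoint features will fail; it is only one signal to look at alongside how the container was started.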
docs/torchconnector/introduction.md: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
## Overview
OSS Torch Connector provides both [Map-style and Iterable-style datasets](https://pytorch.org/docs/stable/data.html#dataset-types) for loading datasets from OSS.
- And also provides a method for loading and saving checkpoints from and to OSS.
+ And also provides a method for loading and saving checkpoints or [Distributed Checkpoints (DCP)](https://docs.pytorch.org/docs/stable/distributed.checkpoint.html) from and to OSS.
The core part of OSS Connector for AI/ML is implemented in C++ using [PhotonLibOS](https://github.com/alibaba/PhotonLibOS). This repository contains only the Python code.