Commit 6f9057d

spolti and yuzisun authored
Initial segregation of the storage module from KServe SDK (kserve#4391)

Signed-off-by: Spolti <fspolti@redhat.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
1 parent a3bd0a5 commit 6f9057d

12 files changed: +3452 -1 lines changed

.github/workflows/python-publish.yml

Lines changed: 8 additions & 1 deletion
@@ -26,9 +26,16 @@ jobs:
       - name: Install Poetry and version plugin
         run: ./test/scripts/gh-actions/setup-poetry.sh
 
-      - name: Build and publish
+      - name: KServe - Build and publish
         env:
           POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_TOKEN }}
         run: |
           cd python/kserve
           poetry publish --build
+
+      - name: KServe Storage - Build and publish
+        env:
+          POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_TOKEN }}
+        run: |
+          cd python/storage
+          poetry publish --build

python/storage/README.md

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
# kserve-storage

A Python module for handling model storage and retrieval for KServe. This package provides a unified API to download models from various storage backends including cloud providers, file systems, and model hubs.

## Features

- Support for multiple storage backends:
  - Local file system
  - Google Cloud Storage (GCS)
  - Amazon S3
  - Azure Blob Storage
  - Azure File Share
  - HTTP/HTTPS URLs
  - HDFS/WebHDFS
  - Hugging Face Hub
- Automatic extraction of compressed files (zip, tar.gz, tgz)
- Configuration via environment variables
- Logging and error handling
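
As a rough illustration of the "automatic extraction" feature listed above, the sketch below unpacks the three archive types the README names using only the standard library. The `extract_archive` helper is hypothetical and is not part of `kserve-storage`; the package's actual extraction logic may differ.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive_path: str, out_dir: str) -> bool:
    """Unpack a zip, tar.gz, or tgz file into out_dir.

    Hypothetical helper for illustration only (not the kserve-storage API).
    Returns True if the file was recognized and extracted, False otherwise.
    """
    name = Path(archive_path).name.lower()

    if name.endswith(".zip"):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(out_dir)
        return True

    if name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(archive_path, "r:gz") as tf:
            tf.extractall(out_dir)
        return True

    # Any other file is treated as a plain model file and left untouched.
    return False
```

Matching on the file suffix keeps the sketch short. Archives from untrusted sources would need extra care, such as checking member paths before extraction.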

## Installation

```bash
pip install kserve-storage
```

Or with Poetry:

```bash
poetry add kserve-storage
```

## Usage

The main entry point is the `Storage` class which provides a `download` method:

```python
from kserve_storage import Storage

# Download from GCS to a temporary directory
model_dir = Storage.download("gs://your-bucket/model")

# Download from S3 to a specific directory
model_dir = Storage.download("s3://your-bucket/model", "/path/to/destination")
```
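
A single `download` entry point implies that the backend is chosen from the URI itself. The sketch below shows one way such routing could look, based only on the URI forms documented in this README. The `classify_storage_uri` function and its labels are hypothetical, and the matching rules inside `Storage` may differ.

```python
from urllib.parse import urlparse

# Schemes that map directly to one backend (labels are illustrative).
_SCHEME_TO_BACKEND = {
    "gs": "gcs",
    "s3": "s3",
    "hf": "huggingface",
    "hdfs": "hdfs",
    "webhdfs": "hdfs",
    "file": "local",
}


def classify_storage_uri(uri: str) -> str:
    """Return a label for the backend a URI appears to target.

    Hypothetical helper for illustration only (not the kserve-storage API).
    """
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    host = (parsed.hostname or "").lower()

    if scheme in _SCHEME_TO_BACKEND:
        return _SCHEME_TO_BACKEND[scheme]

    if scheme in ("http", "https"):
        # Azure endpoints share the https scheme, so the host decides.
        if host.endswith(".blob.core.windows.net"):
            return "azure-blob"
        if host.endswith(".file.core.windows.net"):
            return "azure-file"
        return "http"

    if scheme == "":
        # No scheme: treat the value as a direct local path.
        return "local"

    raise ValueError(f"unsupported storage URI: {uri!r}")
```

For instance, `classify_storage_uri("/path/to/model")` falls through to the local branch because the string has no scheme.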

## Supported Storage Providers

### Local File System

```python
model_dir = Storage.download("file:///path/to/model")
# or using direct path
model_dir = Storage.download("/path/to/model")
```

### Google Cloud Storage

```python
model_dir = Storage.download("gs://bucket-name/model-path")
```

### Amazon S3

```python
model_dir = Storage.download("s3://bucket-name/model-path")
```

### Azure Blob Storage

```python
model_dir = Storage.download("https://account-name.blob.core.windows.net/container-name/model-path")
```

### Azure File Share

```python
model_dir = Storage.download("https://account-name.file.core.windows.net/share-name/model-path")
```

### HTTP/HTTPS URLs

```python
model_dir = Storage.download("https://example.com/path/to/model.zip")
```

### HDFS

```python
model_dir = Storage.download("hdfs://path/to/model")
# or WebHDFS
model_dir = Storage.download("webhdfs://path/to/model")
```

### Hugging Face Hub

```python
model_dir = Storage.download("hf://org-name/model-name")
# With specific revision
model_dir = Storage.download("hf://org-name/model-name:revision")
```
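
The `hf://org-name/model-name:revision` form packs a repository ID and an optional revision into one string. As a minimal sketch of how that form could be split, assuming the revision is everything after the first colon, the hypothetical `parse_hf_uri` helper below returns both parts. It is not the parser used by `kserve-storage`.

```python
from typing import NamedTuple, Optional


class HfRef(NamedTuple):
    """Repository ID and optional revision taken from an hf:// URI."""

    repo_id: str
    revision: Optional[str]


def parse_hf_uri(uri: str) -> HfRef:
    """Split 'hf://org/model[:revision]' into its parts.

    Hypothetical helper for illustration only (not the kserve-storage API).
    """
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")

    rest = uri[len(prefix):]
    # Assumption: the first colon separates the repo ID from the revision.
    repo_id, _, revision = rest.partition(":")

    parts = repo_id.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'org-name/model-name' in {uri!r}")

    return HfRef(repo_id, revision or None)
```

A URI without a revision yields `revision=None`, which leaves the choice of default branch to whatever performs the download.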

## Environment Variables

### Hugging Face Hub Configuration

Hugging Face Hub settings are handled by the `huggingface_hub` package; its [environment variables reference](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables) lists everything available.

### AWS/S3 Configuration

- `AWS_ENDPOINT_URL`: Custom endpoint URL for S3-compatible storage
- `AWS_ACCESS_KEY_ID`: Access key for S3
- `AWS_SECRET_ACCESS_KEY`: Secret access key for S3
- `AWS_DEFAULT_REGION`: AWS region
- `AWS_CA_BUNDLE`: Path to custom CA bundle
- `S3_VERIFY_SSL`: Enable/disable SSL verification
- `S3_USER_VIRTUAL_BUCKET`: Use virtual hosted-style URLs
- `S3_USE_ACCELERATE`: Use transfer acceleration
- `awsAnonymousCredential`: Use unsigned requests for public access
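
To show how the S3 variables above might be supplied from Python, the sketch below sets a few of them for a hypothetical S3-compatible endpoint. The `apply_s3_settings` helper, the endpoint URL, and the credential values are placeholders invented for illustration. Only the variable names come from this README.

```python
import os
from typing import Dict, MutableMapping, Optional

# Placeholder values for a hypothetical S3-compatible endpoint.
EXAMPLE_S3_SETTINGS = {
    "AWS_ENDPOINT_URL": "https://s3.example.internal:9000",
    "AWS_ACCESS_KEY_ID": "example-access-key",
    "AWS_SECRET_ACCESS_KEY": "example-secret-key",
    "AWS_DEFAULT_REGION": "us-east-1",
}


def apply_s3_settings(
    settings: Dict[str, str],
    environ: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Set S3-related variables, keeping any value that is already defined.

    Hypothetical helper for illustration only (not the kserve-storage API).
    """
    target = os.environ if environ is None else environ
    for name, value in settings.items():
        # setdefault leaves values injected by the deployment untouched.
        target.setdefault(name, value)
```

Calling `apply_s3_settings(EXAMPLE_S3_SETTINGS)` before `Storage.download("s3://your-bucket/model")` would be one way to configure a local test run, assuming the library reads these variables when the download starts.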

### Azure Configuration

- `AZURE_STORAGE_ACCESS_KEY`: Storage account access key
- `AZ_TENANT_ID` / `AZURE_TENANT_ID`: Azure AD tenant ID
- `AZ_CLIENT_ID` / `AZURE_CLIENT_ID`: Azure AD client ID
- `AZ_CLIENT_SECRET` / `AZURE_CLIENT_SECRET`: Azure AD client secret

### HDFS Configuration

- `HDFS_SECRET_DIR`: Directory containing HDFS configuration files
- `HDFS_NAMENODE`: HDFS namenode address
- `USER_PROXY`: User proxy for HDFS
- `HDFS_ROOTPATH`: Root path in HDFS
- `KERBEROS_PRINCIPAL`: Kerberos principal for authentication
- `KERBEROS_KEYTAB`: Path to Kerberos keytab file
- `TLS_CERT`, `TLS_KEY`, `TLS_CA`: TLS configuration files
- `TLS_SKIP_VERIFY`: Skip TLS verification
- `N_THREADS`: Number of download threads

## Storage Configuration

Storage configuration can be provided through environment variables:

- `STORAGE_CONFIG`: JSON string containing storage configuration
- `STORAGE_OVERRIDE_CONFIG`: JSON string to override storage configuration
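
The README does not spell out how the two variables combine. The sketch below assumes a shallow merge in which keys from `STORAGE_OVERRIDE_CONFIG` replace the same keys in `STORAGE_CONFIG`. Both the `load_storage_config` helper and the `type` and `bucket` keys in the usage note are hypothetical.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def load_storage_config(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Read both JSON variables and merge them, override keys winning.

    Hypothetical helper for illustration only (not the kserve-storage API).
    """
    env = os.environ if environ is None else environ

    # A missing or empty variable is treated as an empty JSON object.
    base = json.loads(env.get("STORAGE_CONFIG") or "{}")
    override = json.loads(env.get("STORAGE_OVERRIDE_CONFIG") or "{}")

    # Assumption: a shallow, key-by-key merge is sufficient.
    return {**base, **override}
```

With `STORAGE_CONFIG='{"type": "s3", "bucket": "base-bucket"}'` and `STORAGE_OVERRIDE_CONFIG='{"bucket": "override-bucket"}'`, this sketch keeps `type` from the base configuration and takes `bucket` from the override.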


## License

Apache License 2.0 - See [LICENSE](https://github.com/kserve/kserve/blob/master/LICENSE) for details.
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# Copyright 2023 The KServe Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# flake8: noqa

from .logging import configure_logging, logger
from .kserve_storage import Storage
