Commit 722b269

chore(docs): update docs (#7)
Signed-off-by: scepter914 <scepter914@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent 57609a7 commit 722b269

3 files changed

Lines changed: 132 additions & 30 deletions

File tree

.vscode/settings.json

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"cSpell.words": [
3+
"mlcellar"
4+
]
5+
}

README.md

Lines changed: 33 additions & 30 deletions
@@ -3,13 +3,24 @@
33
<img src="./docs/asset/logo.png" width="250" alt="ml-cellar logo">
44

55
`ml-cellar` provides a CLI to support model registry, storing ML models like a wine cellar and enabling minimal MLOps.
6-
By using GitLFS, `ml-cellar` offers essential MLOps functions including Artifact Store, Model Registry, and Serving.
6+
By using Git LFS, `ml-cellar` offers essential MLOps functions including Artifact Store, Model Registry, and Serving.
77

88
If MLOps at BigTech companies can be compared to a large-scale winery, `ml-cellar` functions like a wine cellar in a small brewery.
99
While AI projects should ideally adopt software like MLflow for MLOps, many projects and organizations cannot afford the development resources that BigTech companies have.
1010
As a result, existing MLOps software has primarily targeted companies that can allocate significant development costs to MLOps (like BigTech).
1111
`ml-cellar` makes "compromises" on MLOps by focusing only on the essential functions, enabling easy adoption of minimal MLOps.
1212

13+
- Feature
14+
- [ ] [Custom Transfer Agent](docs/docs_custom_transfer_agent.md)
15+
16+
## GitHub Git LFS usage & billing notice
17+
18+
This repository uses Git LFS (Large File Storage) to manage large files.
19+
Please note that **GitHub Git LFS has storage and bandwidth limits**.
20+
If the free quota included in your GitHub plan is exceeded, **you may incur additional charges**, or LFS uploads and downloads may be restricted depending on your billing settings.
21+
Before cloning or pushing large files, check your GitHub plan and the repository’s current Git LFS usage.
22+
If you need to handle many large files in your model registry, consider using a [Custom Transfer Agent](docs/docs_custom_transfer_agent.md) to replace the storage backend with AWS S3 or a similar service.
23+
1324
## Contribution
1425

1526
If you want to contribute to this project, please see the documents.
@@ -23,19 +34,21 @@ If you want to see example of the usage of `ml-cellar` as minimum MLOps, please
2334

2435
### 1. Install
2536

26-
- Install Rust
27-
- See [the official document](https://doc.rust-lang.org/cargo/getting-started/installation.html).
37+
- Install Git LFS
38+
- See [the official document](https://github.com/git-lfs/git-lfs/wiki/Installation)
2839

2940
```sh
30-
curl https://sh.rustup.rs -sSf | sh
41+
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
42+
sudo apt-get update
43+
sudo apt-get install git-lfs
44+
git lfs install
3145
```
3246

33-
- Install dependency
47+
- Install Rust
48+
- See [the official document](https://doc.rust-lang.org/cargo/getting-started/installation.html).
3449

3550
```sh
36-
sudo apt install git-lfs awscli
37-
pip install aws2-wrap
38-
cargo install lfs-dal
51+
curl https://sh.rustup.rs -sSf | sh
3952
```
4053

4154
- Install `ml-cellar`
@@ -68,39 +81,29 @@ cd {your_ml_registry}
6881
ml-cellar init
6982
```
7083

71-
- Edit `.ml_cellar.toml`
84+
- Edit `.gitattributes` to add the file patterns you want Git LFS to track
7285

73-
```toml
74-
[ml_cellar]
75-
custom_transfer_agent = false
76-
```
77-
78-
### 3. (Option) Setup with AWS S3
86+
```txt
87+
# --- Log ---
88+
*.log filter=lfs diff=lfs merge=lfs -text
7989
80-
Git LFS on GitHub has about 1 GB limit.
81-
If you need to handle files larger than 1 GB in the repository, I recommend integrating with AWS S3.
90+
# --- Your data ---
91+
*.db filter=lfs diff=lfs merge=lfs -text
92+
```
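
The `.gitattributes` entries above can be sanity-checked with a small script. The following is a minimal sketch, not part of `ml-cellar`: the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and the matching is a simplification that only compares file names against each pattern (real Git attribute matching has more rules). `git check-attr filter -- <path>` remains the authoritative check.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Example content copied from the .gitattributes snippet above.
EXAMPLE = """\
# --- Log ---
*.log filter=lfs diff=lfs merge=lfs -text

# --- Your data ---
*.db filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Return the patterns whose attribute list contains filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


def is_lfs_tracked(path: str, patterns: list[str]) -> bool:
    """Simplified check (assumption): match only the file name, not the full path."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = lfs_patterns(EXAMPLE)
    for candidate in ("models/run1/train.log", "models/run1/config.yaml"):
        print(candidate, is_lfs_tracked(candidate, patterns))
```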
8293

83-
- Edit `.ml_cellar.toml` and change to `custom_transfer_agent = true`.
94+
- Edit `.mlcellar.toml`
8495

8596
```toml
8697
[ml_cellar]
87-
custom_transfer_agent = true
88-
```
98+
use_custom_transfer_agent = false
8999

90-
- Edit `.lfsdalconfig` if you use AWS S3.
100+
[aws]
91101

92-
```txt
93-
[lfs-dal]
94-
scheme = s3
95-
bucket = your_bucket
96-
region = your_region
97102
```
98103

99-
- Run setup command
104+
### 3. (Option) Setup with AWS S3
100105

101-
```sh
102-
ml-cellar setup
103-
```
106+
If you need to handle many large files in your model registry, consider using a [Custom Transfer Agent](docs/docs_custom_transfer_agent.md) to replace the storage backend with AWS S3 or a similar service.
104107

105108
### 4. Start project
106109

docs/docs_custom_transfer_agent.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
1+
2+
# `ml-cellar` with Custom Transfer Agent
3+
## Purpose
4+
### Limitation in GitHub Git LFS
5+
6+
According to [the GitHub documentation](https://docs.github.com/en/billing/concepts/product-billing/git-lfs),
7+
Git LFS on GitHub has a storage quota of 10 GiB for Free and Pro accounts, as well as bandwidth limits.
8+
It also has limitations such as not being able to choose the server region.
9+
With a Custom Transfer Agent, you can store LFS objects on AWS S3 instead and avoid these limitations.
10+
(You can address the storage limitation by paying GitHub for additional capacity.)
11+
12+
### Fee comparison between GitHub Git LFS and AWS S3
13+
14+
(Information as of 2025/12/01)
15+
16+
GitHub includes monthly download bandwidth and storage for Git LFS, depending on your plan:
17+
18+
- Free / Pro / Free for organizations: 10 GiB/month bandwidth + 10 GiB storage
19+
- Team / Enterprise Cloud: 250 GiB/month bandwidth + 250 GiB storage
20+
21+
Notes:
22+
23+
- Bandwidth resets monthly, but storage does not reset (it accumulates over time).
24+
- Downloads by GitHub Actions and collaborators also count against the repo owner’s LFS bandwidth.
25+
26+
If you exceed the included quota and have billing enabled:
27+
28+
- Storage: $0.07 per GiB-month
29+
- Download bandwidth: $0.0875 per GiB
30+
31+
Maximum LFS object size depends on the plan in GitHub-hosted LFS:
32+
33+
- Free / Pro: 2 GB
34+
- Team: 4 GB
35+
- Enterprise Cloud: 5 GB
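
Before pushing, it can help to check whether any file exceeds the per-plan object size listed above. This is a minimal sketch under stated assumptions: the function name `oversized_files` and the plan keys are hypothetical, and "GB" is interpreted as 10^9 bytes, which the source does not specify.

```python
import os

GB = 10**9  # assumption: decimal gigabytes

# Maximum LFS object size per plan, taken from the list above.
MAX_OBJECT_BYTES = {
    "free": 2 * GB,
    "pro": 2 * GB,
    "team": 4 * GB,
    "enterprise_cloud": 5 * GB,
}


def oversized_files(root: str, limit_bytes: int) -> list[tuple[str, int]]:
    """Return (path, size) for every regular file under root larger than limit_bytes."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks, which may be dangling
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((path, size))
    return sorted(found)


if __name__ == "__main__":
    for path, size in oversized_files(".", MAX_OBJECT_BYTES["free"]):
        print(f"{path}: {size / GB:.2f} GB exceeds the Free/Pro limit")
```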
36+
37+
By using a Custom Transfer Agent, you can replace the storage backend with Amazon S3.
38+
39+
- Storage: about $0.023 per GB-month (first 50 TB tier)
40+
- Data transfer: first 100 GB/month free, then roughly $0.09 per GB
41+
42+
### Example use case
43+
44+
Assumptions:
45+
- Stored: 200 GiB
46+
- Downloaded per month: 1 TiB (≈ 1024 GiB)
47+
48+
- 1. GitHub Free + AWS S3
49+
- GitHub: $0
50+
- S3 storage: ~200 GB × $0.023 ≈ $4.60 / month (region may vary)
51+
- S3 egress: (1024 − 100) GB × $0.09 = $83.16 / month
52+
- Total ≈ $87.76 / month
53+
- 2. GitHub Enterprise Cloud + GitHub-hosted LFS
54+
- GitHub Enterprise Cloud: $21 × N users / month
55+
- Overage bandwidth: (1024 − 250) GiB × $0.0875 = $67.73 / month
56+
- Total ≈ ($21 × N) + $67.73 / month
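
The arithmetic in this use case can be reproduced with a short script. It is a sketch that uses only the prices quoted above (as of 2025/12/01); the function names are hypothetical, and, like the example itself, it treats GB and GiB as interchangeable.

```python
# Prices quoted in this document; actual prices vary by region and over time.
S3_STORAGE_PER_GB = 0.023
S3_EGRESS_PER_GB = 0.09
S3_FREE_EGRESS_GB = 100
GITHUB_STORAGE_PER_GIB = 0.07
GITHUB_BANDWIDTH_PER_GIB = 0.0875


def s3_monthly_cost(stored: float, downloaded: float) -> float:
    """S3 storage plus egress beyond the free 100 GB/month."""
    storage = stored * S3_STORAGE_PER_GB
    egress = max(0, downloaded - S3_FREE_EGRESS_GB) * S3_EGRESS_PER_GB
    return storage + egress


def github_lfs_overage(
    stored: float,
    downloaded: float,
    included_storage: float,
    included_bandwidth: float,
) -> float:
    """GitHub LFS charges beyond the quota included in the plan."""
    storage = max(0, stored - included_storage) * GITHUB_STORAGE_PER_GIB
    bandwidth = max(0, downloaded - included_bandwidth) * GITHUB_BANDWIDTH_PER_GIB
    return storage + bandwidth


if __name__ == "__main__":
    # Assumptions from the use case: 200 GiB stored, 1 TiB downloaded per month.
    print(f"S3: ${s3_monthly_cost(200, 1024):.2f} / month")
    overage = github_lfs_overage(200, 1024, 250, 250)
    print(f"GitHub Enterprise Cloud LFS overage: ${overage:.3f} / month (plus $21 per user)")
```

In scenario 2 the storage term is zero because 200 GiB is below the 250 GiB included with Enterprise Cloud, which is why only bandwidth overage appears in the total.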
57+
58+
## Get started
59+
### Install
60+
61+
- Install dependencies
62+
63+
```sh
64+
sudo apt install git-lfs awscli
65+
pip install aws2-wrap
66+
cargo install lfs-dal
67+
```
68+
69+
### Setup
70+
71+
- Edit `.mlcellar.toml` and change to `use_custom_transfer_agent = true`.
72+
73+
```toml
74+
[ml_cellar]
75+
use_custom_transfer_agent = true
76+
77+
[aws]
78+
profile = "{your_profile}"
79+
```
80+
81+
- Edit `.lfsdalconfig` if you use AWS S3.
82+
83+
```txt
84+
[lfs-dal]
85+
scheme = s3
86+
bucket = your_bucket
87+
region = your_region
88+
```
89+
90+
- Run setup command
91+
92+
```sh
93+
ml-cellar setup
94+
```
