Commit 0968418

Merge pull request #99 from amikos-tech/codex/releases-fallback-finalize
Harden release downloader and finalize releases-first model
2 parents c5931d2 + 610d392 commit 0968418

15 files changed

Lines changed: 1064 additions & 225 deletions

.github/workflows/benchmark.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,7 @@ jobs:
 uses: ./.github/actions/setup-rust
 
 - name: Build Rust library
-run: cargo build --release
+run: cargo build --release --locked
 
 - name: Run benchmarks
 run: |
@@ -95,7 +95,7 @@ jobs:
 run: go install golang.org/x/perf/cmd/benchstat@latest
 
 - name: Build Rust library
-run: cargo build --release
+run: cargo build --release --locked
 
 - name: Run benchmarks on PR branch
 run: |

.github/workflows/go-release.yml

Lines changed: 1 addition & 1 deletion
@@ -108,8 +108,8 @@ jobs:
 
 ### Environment Variables
 - `TOKENIZERS_LIB_PATH`: Override library path
-- `TOKENIZERS_GITHUB_REPO`: Custom GitHub repository
 - `TOKENIZERS_VERSION`: Specific Rust library version to use
+- `GITHUB_TOKEN` / `GH_TOKEN`: Optional GitHub API auth for fallback requests
 
 ### Documentation
 See the [README](https://github.com/${{ github.repository }}) for detailed usage instructions.

.github/workflows/test-download.yml

Lines changed: 2 additions & 8 deletions
@@ -36,16 +36,10 @@ jobs:
 - name: Test automatic download
 env:
 TOKENIZERS_VERSION: ${{ github.event.inputs.test_version || 'latest' }}
+TOKENIZERS_REQUIRE_ONLINE_TESTS: "1"
 run: |
-go test -v -run "TestDownloadFunctionality" -timeout=10m
+go test -v -run "TestDownloadLibraryFromGitHub|TestDownloadFunctionality/GetAvailableVersions" -timeout=10m
 
 - name: Test library info
 run: |
 go test -v -run "TestGetLibraryInfo"
-
-- name: Test with environment variable override
-env:
-TOKENIZERS_GITHUB_REPO: ${{ github.repository }}
-TOKENIZERS_VERSION: ${{ github.event.inputs.test_version || 'latest' }}
-run: |
-go test -v -run "TestDownloadFunctionality/GetAvailableVersions" -timeout=5m

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -20,3 +20,5 @@ go.work.sum
 .env
 /unit.xml
 target/
+.agents/skills/releasing-to-r2/
+skills-lock.json

CLAUDE.md

Lines changed: 5 additions & 5 deletions
@@ -109,7 +109,7 @@ The system follows a priority order for loading the tokenizer library:
 1. User-provided path via `WithLibraryPath()` option
 2. `TOKENIZERS_LIB_PATH` environment variable
 3. Cached library in platform-specific directory
-4. Automatic download from GitHub releases to cache
+4. Automatic download from `releases.amikos.tech` (with GitHub Releases fallback) to cache
 
 ### Version Management
 The project uses a single version from `Cargo.toml` for both the library and ABI compatibility:
@@ -139,7 +139,8 @@ The project uses a single version from `Cargo.toml` for both the library and ABI
 
 **Download System (download.go)**
 - Automatic platform detection and asset selection
-- GitHub releases integration with checksum verification
+- Primary `releases.amikos.tech` endpoint with GitHub Releases fallback
+- Checksum verification for all downloaded assets
 - Intelligent caching in OS-appropriate directories
 
 **Rust Layer (src/lib.rs)**
@@ -171,9 +172,8 @@ For detailed cache structure and management, see `docs/CACHE_MANAGEMENT.md`.
 ## Environment Variables
 
 - `TOKENIZERS_LIB_PATH`: Override library path
-- `TOKENIZERS_GITHUB_REPO`: Custom GitHub repository (default: `amikos-tech/pure-tokenizers`)
 - `TOKENIZERS_VERSION`: Specific version to download (default: `latest`)
-- `GITHUB_TOKEN` or `GH_TOKEN`: GitHub authentication for API requests
+- `GITHUB_TOKEN` or `GH_TOKEN`: Optional GitHub authentication for fallback API requests
 - `HF_TOKEN`: HuggingFace authentication token for private/gated models
 - `HF_HUB_CACHE`: Override HuggingFace cache directory
 - `HF_MAX_TOKENIZER_SIZE`: Maximum tokenizer file size in bytes (default: 524288000 / 500MB)
@@ -211,4 +211,4 @@ make clean
 - **Memory Safety**: Proper cleanup of FFI resources with defer statements
 - **Buffer Management**: Zero-copy where possible, explicit memory management for C strings
 - **Cross-platform**: Uses runtime detection for platform-specific library names and paths
-- Always lint both golang and rust before commiting or pushing code
+- Always lint both golang and rust before commiting or pushing code

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

README.md

Lines changed: 8 additions & 4 deletions
@@ -259,9 +259,8 @@ tokenizer, err := tokenizers.FromFile("tokenizer.json",
 | Variable | Description | Default |
 |----------|-------------|---------|
 | `TOKENIZERS_LIB_PATH` | Custom library path | Auto-detect |
-| `TOKENIZERS_GITHUB_REPO` | GitHub repo for downloads | `amikos-tech/pure-tokenizers` |
 | `TOKENIZERS_VERSION` | Library version to download | `latest` |
-| `GITHUB_TOKEN` | GitHub API token (for rate limits) | None |
+| `GITHUB_TOKEN` / `GH_TOKEN` | Optional token for GitHub API/authenticated fallback requests | unset |
 
 ### Library Loading Options
 
@@ -274,7 +273,7 @@ tokenizer, err := tokenizers.FromFile("tokenizer.json",
 // 1. User-provided path via WithLibraryPath()
 // 2. TOKENIZERS_LIB_PATH environment variable
 // 3. Cached library in platform directory
-// 4. Automatic download from GitHub releases
+// 4. Automatic download from releases.amikos.tech (with GitHub Releases fallback)
 ```
 
 ### Cache Management
@@ -291,6 +290,11 @@ err := tokenizers.ClearLibraryCache()
 
 // Download and cache a specific version
 err := tokenizers.DownloadAndCacheLibraryWithVersion("v0.1.0")
+
+// Discover release versions
+versions, err := tokenizers.GetAvailableVersions()
+// Note: current release metadata exposes only latest.json,
+// so this returns at most one latest version.
 ```
 
 #### HuggingFace Cache
@@ -406,4 +410,4 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 
 ## Acknowledgments
 
-Built on top of the excellent [Hugging Face Tokenizers](https://github.com/huggingface/tokenizers) library.
+Built on top of the excellent [Hugging Face Tokenizers](https://github.com/huggingface/tokenizers) library.

docs/CI-CD.md

Lines changed: 13 additions & 15 deletions
@@ -25,37 +25,36 @@ The project uses GitHub Actions for CI/CD with multiple workflows to ensure code
 - Go linting with golangci-lint
 - Caching for faster builds
 
-### 2. Build and Release Workflow (`.github/workflows/build-and-release.yml`)
+### 2. Rust Release Workflow (`.github/workflows/rust-release.yml`)
 
-**Triggers:** Git tags starting with 'v', Pull Requests, Manual dispatch
+**Triggers:** Rust release tags (`rust-v*`)
 
-**Purpose:** Build libraries for all supported platforms and create releases
+**Purpose:** Build native library artifacts for all supported platforms and publish them to the releases endpoint
 
 **Supported Platforms:**
 - Linux: `x86_64-unknown-linux-gnu`, `aarch64-unknown-linux-gnu`
 - macOS: `x86_64-apple-darwin`, `aarch64-apple-darwin`
 - Windows: `x86_64-pc-windows-msvc`
 
 **Jobs:**
-- **build**: Cross-compiles for all target platforms
-- **test**: Tests Go bindings with built libraries
-- **release**: Creates GitHub releases with all platform assets
+- **build**: Cross-compiles for all target platforms and uploads platform-specific `libtokenizers-*.tar.gz` artifacts
+- **release**: Generates `SHA256SUMS` and per-asset checksums, signs/verifies artifacts, then publishes to the releases endpoint
 
 **Artifacts:**
 - Platform-specific tar.gz archives containing the shared libraries
 - SHA256 checksum files for each archive
 - Automatic GitHub release creation for tagged versions
 
-### 3. Cross Compilation Test (`.github/workflows/cross-compile.yml`)
+### 3. Go Release Workflow (`.github/workflows/go-release.yml`)
 
-**Triggers:** Changes to Rust source code or build configuration
+**Triggers:** Go release tags (`v*`)
 
-**Purpose:** Verify cross-compilation works for all targets
+**Purpose:** Validate Go module release flow against published Rust artifacts and publish Go module release metadata
 
 **Features:**
-- Tests compilation for all supported targets
-- Includes additional targets like musl variants
-- Verifies library files are created correctly
+- Tests downloads against released native artifacts
+- Verifies Go bindings against released libraries
+- Publishes Go release output
 
 ### 4. Download Functionality Test (`.github/workflows/test-download.yml`)
 
@@ -121,7 +120,7 @@ This installs:
 
 # Or step by step
 make build
-make test-v2
+make test
 
 # Test specific functionality
 make test-download
@@ -157,7 +156,6 @@ Examples:
 
 ### User Environment Variables
 
-- `TOKENIZERS_GITHUB_REPO`: Override GitHub repository for downloads
 - `TOKENIZERS_VERSION`: Specify version to download
 - `TOKENIZERS_LIB_PATH`: Override library path
 
@@ -217,4 +215,4 @@ When contributing:
 - **CI Status**: Check GitHub Actions tab
 - **Release Status**: Monitor releases page
 - **Download Stats**: Available in GitHub insights
-- **Test Coverage**: Generated in CI artifacts
+- **Test Coverage**: Generated in CI artifacts

docs/DEPLOYMENT.md

Lines changed: 16 additions & 15 deletions
@@ -18,7 +18,7 @@ This guide covers the deployment and release process for the CGo-free Tokenizers
 
 3. **Monitor the build:**
 - Go to GitHub Actions tab
-- Watch the "Build and Release" workflow
+- Watch the `rust-release.yml` and `go-release.yml` workflows
 - Release will be created automatically when complete
 
 ## Supported Platforms
@@ -31,13 +31,14 @@ The CI system builds for the following platforms:
 | Linux | ARM64 | `aarch64-unknown-linux-gnu` | `libtokenizers.so` |
 | macOS | Intel | `x86_64-apple-darwin` | `libtokenizers.dylib` |
 | macOS | Apple Silicon | `aarch64-apple-darwin` | `libtokenizers.dylib` |
-| Windows | x86_64 | `x86_64-pc-windows-msvc` | `libtokenizers.dll` |
+| Windows | x86_64 | `x86_64-pc-windows-msvc` | `tokenizers.dll` |
 
 ## Release Assets
 
 Each release includes:
 
 - **Platform-specific archives**: `libtokenizers-{arch}-{platform}.tar.gz`
+- **Manifest checksum file**: `SHA256SUMS` (primary verification source)
 - **Checksum files**: `libtokenizers-{arch}-{platform}.tar.gz.sha256`
 - **Automatic release notes**: Generated from commits and PRs
 
@@ -47,20 +48,20 @@ The Go library automatically downloads the appropriate platform library:
 
 ```go
 // Automatic download
-tokenizer, err := tokenizers.FromFile("config.json",
-tokenizers.WithDownloadLibrary())
+tokenizer, err := tokenizers.FromFile("config.json")
 
 // Manual path
 tokenizer, err := tokenizers.FromFile("config.json",
 tokenizers.WithLibraryPath("/path/to/lib"))
 ```
 
+Downloads are attempted from `releases.amikos.tech` first and fall back to GitHub Releases if needed.
+
 ## Environment Variables
 
 ### For Users
 
 - `TOKENIZERS_LIB_PATH`: Override library path
-- `TOKENIZERS_GITHUB_REPO`: Custom repository for downloads
 - `TOKENIZERS_VERSION`: Specific version to download
 
 ### For CI/CD
@@ -97,7 +98,7 @@ make create-release-assets
 
 ```bash
 # Run all tests
-make test-v2
+make test
 
 # Test download functionality
 make test-download
@@ -113,15 +114,15 @@ make test-rust
 - **Purpose**: Basic testing and validation
 - **Platforms**: Linux, macOS, Windows
 
-### 2. Build and Release (`build-and-release.yml`)
-- **Trigger**: Git tags (`v*`)
-- **Purpose**: Create releases with all platform assets
-- **Features**: Cross-compilation, checksum generation, automatic releases
+### 2. Rust Release (`rust-release.yml`)
+- **Trigger**: Rust tags (`rust-v*`)
+- **Purpose**: Build and publish platform assets to releases endpoint
+- **Features**: Cross-compilation, `SHA256SUMS` + per-asset checksums, artifact publishing
 
-### 3. Cross Compilation Test (`cross-compile.yml`)
-- **Trigger**: Changes to Rust code
-- **Purpose**: Verify cross-compilation works
-- **Targets**: All supported platforms + additional variants
+### 3. Go Release (`go-release.yml`)
+- **Trigger**: Go tags (`v*`)
+- **Purpose**: Validate and publish Go module releases
+- **Features**: Tests against released Rust artifacts and publishes module release
 
 ### 4. Download Test (`test-download.yml`)
 - **Trigger**: Weekly schedule, manual dispatch
@@ -208,4 +209,4 @@ For deployment issues:
 1. Check the [CI/CD documentation](CI-CD.md)
 2. Review GitHub Actions logs
 3. Test locally with provided scripts
-4. Open an issue with detailed error information
+4. Open an issue with detailed error information
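
The docs/DEPLOYMENT.md changes name `SHA256SUMS` as the primary verification source for release assets. As a rough illustration of what verifying an asset against such a manifest involves, the sketch below assumes the manifest uses the common `sha256sum` line format (`<hex digest>  <file name>`); the diff does not confirm the exact layout. `expectedSum`, `verifyAsset`, and the asset name in `main` are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// expectedSum looks up name in a manifest whose lines are assumed to look
// like "<hex digest>  <file name>" (sha256sum output). File names that
// contain spaces are not handled by this sketch.
func expectedSum(manifest []byte, name string) (string, bool) {
	sc := bufio.NewScanner(bytes.NewReader(manifest))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		// sha256sum marks binary-mode entries with a leading '*'.
		if strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// verifyAsset hashes data and compares it with the manifest entry for name.
func verifyAsset(manifest []byte, name string, data []byte) error {
	want, ok := expectedSum(manifest, name)
	if !ok {
		return fmt.Errorf("no checksum entry for %s", name)
	}
	sum := sha256.Sum256(data)
	if got := hex.EncodeToString(sum[:]); got != want {
		return fmt.Errorf("checksum mismatch for %s: got %s, want %s", name, got, want)
	}
	return nil
}

func main() {
	// Build a one-line manifest from the data itself, purely for demonstration.
	data := []byte("example archive bytes")
	sum := sha256.Sum256(data)
	manifest := []byte(hex.EncodeToString(sum[:]) + "  libtokenizers-example.tar.gz\n")
	fmt.Println(verifyAsset(manifest, "libtokenizers-example.tar.gz", data))
}
```

Treating a missing manifest entry as an error, rather than skipping verification, avoids silently accepting an unverified archive.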
