Julia binding for [libcozip](https://github.com/asterisk-labs/taco/tree/main/cozip) — pack files into a Cloud-Optimized ZIP archive readable over HTTP range requests.
+ Julia binding for libcozip. Open a Cloud-Optimized ZIP archive like a table over HTTP range requests, or write one from a DataFrame.
- The native `libcozip` binary is fetched automatically via Julia Artifacts; no C toolchain required.
+ The native `libcozip` binary is fetched automatically via Julia Artifacts, no C toolchain required.
For inspecting or tuning the `__metadata__` parquet between steps:
+ `name` is how each file appears inside the archive. `path` is where it lives on disk, consumed at write time and dropped from the manifest. Any additional columns ride along into `__metadata__` and become queryable on read.
`create_options` is passed through to DuckDB's `COPY ... TO '...' (FORMAT parquet, <opts>)`, so anything DuckDB accepts works (e.g. `"COMPRESSION 'zstd', ROW_GROUP_SIZE 100000"`).
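To make the pass-through concrete, here is a minimal sketch of how an options string could be spliced into the `COPY` statement quoted above. The helper `copy_statement`, the table name `manifest`, and the output path `metadata.parquet` are hypothetical illustrations, not part of Cozip.jl's API; only the statement shape comes from the text.

```julia
# Hypothetical helper (not a Cozip.jl function): builds a statement with the
# shape COPY ... TO '...' (FORMAT parquet, <opts>).
function copy_statement(table::AbstractString, path::AbstractString,
                        create_options::AbstractString = "")
    # With no extra options, only the format clause is emitted.
    opts = isempty(create_options) ? "FORMAT parquet" :
                                     "FORMAT parquet, $create_options"
    return "COPY $table TO '$path' ($opts)"
end

sql = copy_statement("manifest", "metadata.parquet",
                     "COMPRESSION 'zstd', ROW_GROUP_SIZE 100000")
println(sql)
```

Because the options are inserted verbatim, a typo in `create_options` would surface as a DuckDB parse error rather than being caught earlier.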
Any additional columns in `table` are propagated into `__metadata__`:
+ `manifest` is a DataFrame with `name`, `offset`, `size`, plus whatever extras the writer added. Local file or remote URL, same call. Only the byte-0 index and the embedded `__metadata__` Parquet are fetched, never the user payloads.
+
+ Filter the manifest like any DataFrame, then use `offset` and `size` to range-request payloads.
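The filter-then-range-request step can be sketched as follows. This assumes `offset` is the position of the first payload byte and `size` is the payload length in bytes; the manifest rows and the `range_header` helper are made up for illustration and are not taken from Cozip.jl.

```julia
using DataFrames

# Hypothetical manifest; real values would come from reading an archive.
manifest = DataFrame(
    name   = ["a.tif", "b.tif", "notes.txt"],
    offset = [1024, 5120, 9216],
    size   = [4096, 4096, 128],
)

# Keep only the entries of interest, as with any DataFrame.
tifs = filter(:name => endswith(".tif"), manifest)

# HTTP byte ranges are inclusive on both ends, hence the `- 1`.
range_header(offset, size) = "bytes=$(offset)-$(offset + size - 1)"

# One Range header value per selected entry.
tifs.range = range_header.(tifs.offset, tifs.size)
```

With an HTTP client, each string in `tifs.range` would be sent as the value of a `Range` request header against the archive URL.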
The optional `in_index` column (default `true`) controls whether each entry is recorded in the cozip index — entries with `in_index = false` go into the ZIP but aren't exposed as cozip-indexed entries.
-
## Versioning
- `Cozip.jl` tracks the C library, which uses CalVer with 4 components (e.g. `2026.5.2.6`). The Julia package itself uses 3 components because Julia enforces strict SemVer; the fourth is exposed via:
+ `Cozip.jl` tracks the C library. The C side uses 4-component CalVer (e.g. `2026.5.2.6`). The Julia side uses the first three because Julia enforces strict SemVer. The fourth component is exposed at runtime.
```julia
using Cozip
- Cozip.LibCozip.cozip_version() #→ "2026.5.2.6"
+ Cozip.LibCozip.cozip_version() # "2026.5.2.6"
```
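The mapping between the two version schemes can be illustrated with a short sketch. `split_calver` is a hypothetical helper written for this example, not a function Cozip.jl is known to export; it only shows how a 4-component string relates to a 3-component `VersionNumber`.

```julia
# Split a 4-component CalVer string into a 3-component VersionNumber
# plus the trailing fourth component.
function split_calver(s::AbstractString)
    parts = parse.(Int, split(s, '.'))
    length(parts) == 4 ||
        throw(ArgumentError("expected 4 components, got $(length(parts))"))
    return VersionNumber(parts[1], parts[2], parts[3]), parts[4]
end

pkg_version, fourth = split_calver("2026.5.2.6")
```

Here `pkg_version` holds the first three components as a `VersionNumber`, and `fourth` holds the component that the package version cannot carry.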
## Spec
- See the [cozip spec](https://github.com/asterisk-labs/taco/tree/main/cozip) for the on-disk format.
+ See [SPEC.md](https://github.com/asterisk-labs/cozip/blob/main/SPEC.md) for the on-disk format.