[feature] response:stream-binary-resource — zero-copy binary download#6466
joewiz wants to merge 2 commits into
response:stream-binary($binary, ...) requires the caller to first
materialize the whole resource into the JVM heap (e.g. via
util:binary-doc()), so a large download holds the entire resource in
memory before a byte goes out.
Add response:stream-binary-resource($binary-resource-path, $content-type,
$filename?), which opens the stored BinaryDocument and copies it straight
to the servlet response output stream via broker.readBinaryResource(Txn,
...) -- the same zero-copy path RESTServer uses for binary downloads --
without ever materializing the bytes. Intended for large downloads.
A client (e.g. existdb-openapi's GET /api/db/resource/{path}) can then
replace util:binary-doc(...) => response:stream-binary(...) with a single
response:stream-binary-resource($path, ...) call.
Adds RestBinariesTest#streamBinaryResourceRaw asserting the wire result is
byte-identical over a real HTTP request. RestBinariesTest 4/4 green;
Codacy PMD clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
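To make the memory argument concrete, here is a minimal, self-contained Java sketch contrasting the two copy strategies. It is an illustration only: the class and method names are hypothetical, it uses in-memory streams in place of the broker and the servlet response, and it is not the code in this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical illustration only: none of these names exist in eXist-db.
final class StreamingCopySketch {

    // Materializing approach: the whole payload becomes one byte[] before anything is written.
    static long copyMaterialized(final InputStream in, final OutputStream out) throws IOException {
        final byte[] all = in.readAllBytes(); // heap use grows with the resource size
        out.write(all);
        return all.length;
    }

    // Streaming approach: one fixed-size buffer is reused, so heap use does not depend on resource size.
    static long copyStreaming(final InputStream in, final OutputStream out, final int bufferSize) throws IOException {
        final byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(final String[] args) throws IOException {
        final byte[] payload = new byte[1_000_000];
        Arrays.fill(payload, (byte) 0x5A);

        final ByteArrayOutputStream materialized = new ByteArrayOutputStream();
        final ByteArrayOutputStream streamed = new ByteArrayOutputStream();
        copyMaterialized(new ByteArrayInputStream(payload), materialized);
        copyStreaming(new ByteArrayInputStream(payload), streamed, 8192);

        // Both paths produce the same bytes; only the peak buffer size differs.
        System.out.println(Arrays.equals(materialized.toByteArray(), streamed.toByteArray()));
    }
}
```

The streaming variant never holds more than `bufferSize` bytes of the payload at once, which is the property the new function is after for large downloads.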
That is BIG!
try {
    uri = XmldbURI.xmldbUriFor(path);
} catch (final URISyntaxException e) {
    throw new XPathException(this, "Invalid binary resource path: " + path, e);
I would like to have a fitting error code here, too.
…tent-Disposition test
Address @line-o's review on eXist-db#6466: every failure path now carries a fitting error code instead of a bare XPathException, and the 3-arg (filename) form has explicit Content-Disposition coverage.
- Resource is not a binary document -> XPTY0004 (a type error, per the review).
- Invalid resource path / permission denied -> FODC0002 ("Error retrieving resource"), the same code already used for the not-found case, so all three "cannot get the resource" conditions share one code.
- Transaction / IO error while streaming -> new EXXQDY0007 ("I/O error while streaming a binary resource to the response"), in the eXist error namespace.
- RestBinariesTest: add streamBinaryResourceWithFilename, asserting the 3-arg form sends Content-Disposition: inline; filename="..." with a byte-identical body.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
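The mapping from failure condition to error code described in this commit can be summarized as a small lookup. The following is a rough sketch, not the PR's actual code: the `Failure` enum and `codeFor` method are hypothetical names introduced only for illustration, and the codes are returned as plain strings rather than eXist's error-code objects.

```java
// Hypothetical sketch: Failure and codeFor are illustrative names, not eXist-db types.
final class StreamErrorCodeSketch {

    enum Failure { INVALID_PATH, PERMISSION_DENIED, NOT_FOUND, NOT_BINARY, STREAMING_IO }

    // Maps each failure condition to the error code named in the commit message.
    static String codeFor(final Failure failure) {
        switch (failure) {
            case INVALID_PATH:
            case PERMISSION_DENIED:
            case NOT_FOUND:
                // All three "cannot get the resource" conditions share one code.
                return "FODC0002";
            case NOT_BINARY:
                // The resource exists but is not a binary document: a type error.
                return "XPTY0004";
            case STREAMING_IO:
                // Transaction or I/O error while streaming (eXist error namespace).
                return "EXXQDY0007";
            default:
                throw new IllegalArgumentException("Unknown failure: " + failure);
        }
    }

    public static void main(final String[] args) {
        for (final Failure failure : Failure.values()) {
            System.out.println(failure + " -> " + codeFor(failure));
        }
    }
}
```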
[This response was co-authored with Claude Code. -Joe] Thanks @line-o — all addressed in df06128. Every failure path now carries a fitting error code instead of a bare
I left the servlet-context guard on
For the test: added
[This PR was co-authored with Claude Code. -Joe]
Summary
Adds response:stream-binary-resource($binary-resource-path, $content-type, $filename?), which streams a stored binary resource straight from the database to the servlet response output stream, without ever materializing it in the JVM heap.
response:stream-binary($binary, …) takes a fully loaded xs:base64Binary, so the caller must first read the whole resource into memory (e.g. via util:binary-doc()). For a large download that means the entire resource is materialized on the heap before a single byte goes out. The new function opens the stored BinaryDocument and copies it directly to the response via broker.readBinaryResource(Txn, BinaryDocument, OutputStream), the same zero-copy path RESTServer already uses for binary downloads (RESTServer.java:1782–1783).
Signature
How it relates to the existing response: streamers
response:stream($node, $opts)
response:stream-binary($binary, $type, $f?): xs:base64Binary, binary.streamBinaryTo(os)
response:stream-binary-resource($path, $type, $f?) (new): BinaryDocument, broker.readBinaryResource(txn, binDoc, os)
What changed
- response/StreamBinaryResource.java (new): the function. Resolves the path (read lock + permission check, the same way util:binary-doc does), verifies it's a binary document, sets Content-Type (and Content-Disposition from $filename), and streams zero-copy within a transaction.
- response/ResponseModule.java: registers the two arities.
- RestBinariesTest.java: streamBinaryResourceRaw stores a binary and asserts the bytes returned over a real HTTP request are byte-identical (mirrors the existing streamBinaryRaw test).
Test
RestBinariesTest: 4/4 green (incl. the new byte-identical case). exist-core builds clean; Codacy PMD clean.
Scope: this is db paths only (file: is deliberately out of scope)
The function operates on database binary resources, not local filesystem files. Nothing inherent prevents a file: variant; streaming a local file is actually simpler (no broker, lock, transaction, or BinaryDocument: just Files.newInputStream(path).transferTo(os)). It's left out on purpose, for two reasons:
- response:stream-binary-resource('file:///etc/passwd', …) would stream arbitrary server files to a client. eXist treats file: access from XQuery as a DBA-only boundary, so a file: branch would need that guard and warrants its own scrutiny.
- A file: variant should follow the convention that PR #6192 ("[feature] Support file: URIs in fn:collection() for filesystem directory querying") is establishing for fn:collection rather than inventing a parallel one.
If file: support were wanted, the shape would mirror #6192's fn:collection: branch on the URI scheme (db vs file:), restrict the file: branch to DBA users (a security boundary, consistent with fn:doc() / #6192), and stream via Files.newInputStream(...).transferTo(response.getOutputStream()) (zero-copy, no materialization). That fills a real gap (the EXPath file:read-binary materializes the whole file), but it belongs in its own PR.
Context
This is the download half of the binary-streaming work surfaced by the existdb-openapi binary-transport track (eXist-db/existdb-openapi#35 / #38). Functional binary transport already works today via response:stream-binary; this is the scale primitive for large downloads. The upload counterpart (a non-materializing request:get-input-stream() so a handler can pipe a large upload to broker.storeDocument) is deliberately a separate follow-up: it depends on how the request body is consumed (single-read; the Roaster raw-body path) and needs its own investigation.
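
For reference, the out-of-scope file: branch described in the Scope section could look roughly like the sketch below. This is an assumption-laden illustration, not part of this PR: the class name, the `streamFile` method, and the `isDba` flag are hypothetical stand-ins (a real implementation would take the DBA check from the current subject and write to the servlet response), and only the file: side of the scheme branch is shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical sketch: the class, the method, and the isDba flag are illustrative, not eXist-db API.
final class FileBranchSketch {

    // Streams a local file to 'out' without loading it fully into memory.
    static long streamFile(final URI uri, final boolean isDba, final OutputStream out) throws IOException {
        // Branch on the URI scheme: only file: URIs are handled here.
        if (!"file".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not a file: URI: " + uri);
        }
        // Security boundary: filesystem access is limited to DBA users (assumed flag).
        if (!isDba) {
            throw new SecurityException("file: access is restricted to DBA users");
        }
        final Path path = Paths.get(uri);
        try (InputStream in = Files.newInputStream(path)) {
            // transferTo copies through an internal buffer and returns the byte count.
            return in.transferTo(out);
        }
    }

    public static void main(final String[] args) throws IOException {
        final Path tmp = Files.createTempFile("stream-sketch", ".bin");
        try {
            final byte[] payload = {1, 2, 3, 4, 5};
            Files.write(tmp, payload);
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final long copied = streamFile(tmp.toUri(), true, out);
            System.out.println(copied + " bytes, identical=" + Arrays.equals(payload, out.toByteArray()));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Checking the scheme before the DBA guard keeps db paths on the existing code path, so the guard applies only where filesystem access is actually requested.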