
Commit 79f5557

tdurieux and claude committed
improve binary file detection: content sniffing + jsonl support
Files like .jsonl that mime-types doesn't know fell through to application/octet-stream and were rendered as "Unsupported binary file" in the viewer.

Replace istextorbinary with isbinaryfile for content-based detection, and use mime-types for name-based classification with a textual application/* allowlist.

The streaming transformer now defers classification when the name is inconclusive and sniffs the first chunk before emitting "transform", so route.ts and AnonymizedFile.ts get a content-aware Content-Type.

Whitelist .jsonl and .ndjson to short-circuit dataset files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 18ce39e commit 79f5557
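
A minimal sketch of the two-stage classification the commit message describes: decide from the file name first, and sniff the first chunk only when the name is inconclusive. This is not the commit's code. `MIME_BY_EXT`, `TEXTUAL_APPLICATION`, `classifyByName`, `sniffChunk`, and `classify` are hypothetical names, the small extension table stands in for the mime-types lookup, and the NUL-byte check is a deliberately simplified stand-in for isbinaryfile.

```typescript
// Hypothetical stand-ins: the commit itself relies on mime-types and isbinaryfile.
type Classification = "text" | "binary" | "unknown";

// Tiny illustrative extension table; the real lookup is delegated to mime-types.
const MIME_BY_EXT: Record<string, string> = {
  txt: "text/plain",
  json: "application/json",
  xml: "application/xml",
  png: "image/png",
  zip: "application/zip",
};

// application/* types treated as textual even though they are not text/*.
const TEXTUAL_APPLICATION = new Set<string>(["application/json", "application/xml"]);

// Stage 1: classify from the name alone; "unknown" means the name is inconclusive.
function classifyByName(name: string): Classification {
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot + 1).toLowerCase() : "";
  const mime = MIME_BY_EXT[ext];
  if (mime === undefined) return "unknown";
  if (mime.startsWith("text/") || TEXTUAL_APPLICATION.has(mime)) return "text";
  return "binary";
}

// Stage 2 (simplified): any NUL byte in the first chunk is taken to mean binary.
function sniffChunk(chunk: Uint8Array): Classification {
  return chunk.includes(0) ? "binary" : "text";
}

// Combine both stages: sniff content only when the name did not settle it.
function classify(name: string, firstChunk: Uint8Array): Classification {
  const byName = classifyByName(name);
  return byName === "unknown" ? sniffChunk(firstChunk) : byName;
}
```

In this sketch an unrecognised name such as `data.jsonl` falls through to the content sniff instead of defaulting to application/octet-stream, which is the failure mode the commit message reports.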

6 files changed

Lines changed: 154 additions & 158 deletions


package-lock.json

Lines changed: 18 additions & 109 deletions
Generated file; diff not rendered by default.

package.json

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@
     "express-slow-down": "^2.0.1",
     "got": "^11.8.6",
     "inquirer": "^8.2.6",
-    "istextorbinary": "^9.5.0",
+    "isbinaryfile": "^6.0.0",
     "marked": "^5.1.2",
     "mime-types": "^2.1.35",
     "mongoose": "^7.6.10",

src/config.ts

Lines changed: 2 additions & 0 deletions
@@ -69,6 +69,8 @@ const config: Config = {
     "out",
     "sol",
     "in",
+    "jsonl",
+    "ndjson",
   ],
   STORAGE: "filesystem",
   STREAMER_ENTRYPOINT: null,
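
A rough illustration of how an extension list like the one extended above could short-circuit detection for dataset files. The diff does not show which config key holds the list, so `TEXT_EXTENSIONS` and `isAllowlistedText` are hypothetical names, and the case-insensitive match is an assumption.

```typescript
// Hypothetical name: the hunk does not show the config key that owns this list.
const TEXT_EXTENSIONS: readonly string[] = ["out", "sol", "in", "jsonl", "ndjson"];

// Returns true when the file's extension is on the allowlist, so that
// content sniffing can be skipped. Matching is case-insensitive (assumption).
function isAllowlistedText(filename: string): boolean {
  const dot = filename.lastIndexOf(".");
  // No extension, or a trailing dot with nothing after it.
  if (dot < 0 || dot === filename.length - 1) return false;
  return TEXT_EXTENSIONS.includes(filename.slice(dot + 1).toLowerCase());
}
```

With this shape, `train.jsonl` and `events.ndjson` are treated as text immediately, while names outside the list still go through name-based and content-based classification.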
