Skip to content

Conversation

@rbiseck3
Copy link
Contributor

@rbiseck3 rbiseck3 commented Feb 5, 2025

Description

NDJSON files were being detected as JSON due to having the same mime-type. This adds additional logic to skip mime-type based detection if extension is .ndjson

Copy link
Contributor

@scanny scanny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose moving this lower down, otherwise LGTM.

Let me know and I can approve it then.

@scanny
Copy link
Contributor

scanny commented Feb 5, 2025

Also @rbiseck3 I'd like to see a test for this. Good chance to fill in for the other possible cases, like when it comes with an asserted content-type.

@scanny
Copy link
Contributor

scanny commented Feb 5, 2025

The test suite is well developed, you should be able to do most of it with just adding a case to parameterized tests.

@rbiseck3 rbiseck3 force-pushed the roman/fix-ndjson-detection branch from ced1e8e to 0159ec4 Compare February 5, 2025 19:02
@rbiseck3 rbiseck3 requested a review from scanny February 5, 2025 20:24
Copy link
Contributor

@scanny scanny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@rbiseck3 rbiseck3 force-pushed the roman/fix-ndjson-detection branch from 1425726 to cd1dc47 Compare February 6, 2025 16:47
@rbiseck3 rbiseck3 force-pushed the roman/fix-ndjson-detection branch from a59df4f to 665d3d5 Compare February 10, 2025 17:32
@rbiseck3 rbiseck3 added this pull request to the merge queue Feb 11, 2025
Merged via the queue into main with commit 92be4eb Feb 11, 2025
41 checks passed
@rbiseck3 rbiseck3 deleted the roman/fix-ndjson-detection branch February 11, 2025 14:57
temp-adelyn pushed a commit to temp-adelyn/unstructured that referenced this pull request Mar 3, 2025
### Description
NDJSON files were being detected as JSON due to having the same
mime-type. This adds additional logic to skip mime-type based detection
if extension is `.ndjson`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants