Commit 4995fc6
bugfix/fix ndjson detection (Unstructured-IO#3905)
### Description
NDJSON files were being detected as JSON because both formats share the same
MIME type. This adds logic to skip MIME-type-based detection when the file
extension is `.ndjson`.

1 parent 85bfb1b commit 4995fc6
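The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `detect_file_type` and `MIME_TO_TYPE` are hypothetical names, and the mapping holds only a couple of example entries. It assumes the extension check runs before any MIME-type lookup.

```python
import os
from typing import Optional

# Hypothetical lookup table for illustration only; not the library's real mapping.
MIME_TO_TYPE = {
    "application/json": "json",
    "text/plain": "txt",
}


def detect_file_type(filename: str, mime_type: Optional[str]) -> Optional[str]:
    """Return a coarse file-type label, letting the .ndjson extension win."""
    extension = os.path.splitext(filename)[1].lower()

    # NDJSON and JSON can be reported with the same MIME type, so the
    # extension is checked first and MIME-based detection is skipped.
    if extension == ".ndjson":
        return "ndjson"

    # Otherwise fall back to the MIME type, if one was supplied.
    if mime_type is not None:
        return MIME_TO_TYPE.get(mime_type)
    return None
```

With this ordering, `detect_file_type("events.ndjson", "application/json")` yields `"ndjson"`, while a `.json` file with the same MIME type is still labeled `"json"`.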
File tree
6 files changed (+57, -15 lines changed)
- test_unstructured/file_utils
- unstructured
- file_utils
- nlp