If your data sources contain duplicate keys, the diff fails by default, but you can configure this behavior using the `duplicateKeyHandling` option.

You can resolve the conflict by keeping the first or last row of the duplicates:
```Typescript
import { diff } from 'tabular-data-differ';
const stats = await diff({
    oldSource: './tests/a.csv',
    newSource: './tests/b.csv',
    keys: ['id'],
    duplicateKeyHandling: 'keepFirstRow', // or 'keepLastRow'
    duplicateRowBufferSize: 2000,
}).to('console');
console.log(stats);
```
Or, if you need more control over the row selection, you can provide your own handler:
```Typescript
import { diff } from 'tabular-data-differ';
const stats = await diff({
    oldSource: './tests/a.csv',
    newSource: './tests/b.csv',
    keys: ['id'],
    duplicateKeyHandling: (rows) => rows[0], // same as 'keepFirstRow'
    duplicateRowBufferSize: 2000,
}).to('console');
console.log(stats);
```
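As an illustration of a handler that does more than pick by position, here is a minimal sketch that keeps the duplicate with the most filled cells. It assumes that each row is passed to the handler as an array of cell values; the `CellValue` and `Row` types and the `keepMostCompleteRow` and `countFilledCells` names are hypothetical and declared locally, so check them against the library's actual types before using this.

```Typescript
// Hypothetical local types (assumption): the library's real row type may differ.
type CellValue = string | number | boolean | null;
type Row = CellValue[];

// Counts the cells that carry a value (neither null nor an empty string).
function countFilledCells(row: Row): number {
    return row.filter((cell) => cell !== null && cell !== '').length;
}

// Picks the duplicate with the most filled cells; on a tie, the earliest row wins.
function keepMostCompleteRow(rows: Row[]): Row {
    if (rows.length === 0) {
        throw new Error('expected at least one duplicate row');
    }
    return rows.reduce((best, row) =>
        countFilledCells(row) > countFilledCells(best) ? row : best
    );
}
```

If the assumption about the row shape holds, the function could be passed as `duplicateKeyHandling: keepMostCompleteRow` in place of the inline arrow function shown above.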
### Order 2 CSV files and diff them on the console
Don't forget to first install my other lib: `npm i huge-csv-sorter`.
@@ -524,14 +580,16 @@ sortDirection| no | ASC | specifies if the column is sorted in ascen
oldSource | yes | | either a string filename, a URL or a SourceOptions
newSource | yes | | either a string filename, a URL or a SourceOptions
keys | yes | | the list of columns that form the primary key. This is required for comparing the rows. A key can be a string name or a {ColumnDefinition}
includedColumns | no | | the list of columns to keep from the input sources. If not specified, all columns are selected.
excludedColumns | no | | the list of columns to exclude from the input sources.
rowComparer | no | | specifies a custom row comparer.
duplicateKeyHandling | no | fail | specifies how to handle duplicate rows in a source. By default, it fails and throws a UniqueKeyViolationError exception, but you can instead ignore duplicates, keep the first or last row, or provide your own function that receives the duplicates and selects the best candidate.
duplicateRowBufferSize | no | 1000 | specifies the maximum size of the buffer used to accumulate duplicate rows.