Commit e0e5f30 (parent: daccf24)

fix duplicate row detection bug

The bug occurred when a duplicate row existed only in one of the two source files.

File tree

4 files changed: +437 −34 lines


README.md (2 additions, 2 deletions)

@@ -35,8 +35,8 @@ while loading at most two rows of data in memory.
 
 If your data is not already sorted, you can use my other lib https://github.com/livetocode/huge-csv-sorter, which can sort a huge file very efficiently thanks to SQLite.
 
-This allows us to diff two 600MB files containing 2.6 millions of rows and 37 columns in 22 seconds on my MacBook Pro.
-Or two 250 MB files containing 4 millions of rows and 7 columns in 12 seconds.
+This allows us to diff two 600MB files containing 2.6 millions of rows and 37 columns in 18 seconds on my MacBook Pro.
+Or two 250 MB files containing 4 millions of rows and 7 columns in 10 seconds.
 
 # Features
 

package.json (1 addition, 1 deletion)

@@ -1,6 +1,6 @@
 {
   "name": "tabular-data-differ",
-  "version": "1.0.0",
+  "version": "1.0.1",
   "description": "A very efficient library for diffing two sorted streams of tabular data, such as CSV files.",
   "keywords": [
     "table",
