Skip to content
This repository was archived by the owner on Sep 28, 2022. It is now read-only.

Commit 391eb01

Browse files
committed
fix and comment for importvalues
1 parent bfe8680 commit 391eb01

File tree

1 file changed

+17
-0
lines changed

1 file changed

+17
-0
lines changed

Diff for: gpexp/importbatch.go

+17
Original file line numberDiff line numberDiff line change
@@ -522,13 +522,30 @@ func (b *Batch) importValueData() error {
522522

523523
// trim out null values from ids and values.
524524
nullIndices := b.nullIndices[field]
525+
// TODO(jaffee) I think this may be very inefficient. It looks
526+
// like we're copying the `ids` and `values` slices over
527+
// themselves (an O(n) operation) for each nullIndex so this
528+
// is effectively O(n^2). What we could do is iterate through
529+
// ids and values each once, while simultaneously iterating
530+
// through nullindices and keeping track of how many
531+
// nullIndices we've passed, and so how far back we need to
532+
// copy each item.
533+
//
534+
// It was a couple weeks ago that I wrote this code, and I
535+
// vaugely remember thinking about this, so I may just be
536+
// missing something now. We should benchmark on what should
537+
// be a bad case (an int field which is mostly null), and see
538+
// if the improved implementation helps a lot.
525539
for i, nullIndex := range nullIndices {
526540
nullIndex -= uint64(i) // offset the index by the number of items removed so far
527541
ids = append(ids[:nullIndex], ids[nullIndex+1:]...)
528542
values = append(values[:nullIndex], values[nullIndex+1:]...)
529543
}
530544

531545
// now do imports by shard
546+
if len(ids) == 0 {
547+
continue // TODO test this "all nil" case
548+
}
532549
curShard := ids[0] / shardWidth
533550
startIdx := 0
534551
for i := 1; i <= len(ids); i++ {

0 commit comments

Comments
 (0)