
[ntuple] Heuristically reduce memory usage of buffered writing #18314

Open
@hahnjo

Description

At the moment, if implicit multi-threading is turned on, RPageSinkBuf::CommitPage will unconditionally copy the (uncompressed) page and create a task for sealing:

// TODO avoid frequent (de)allocations by holding on to allocated buffers in RColumnBuf
zipItem.fPage = fPageAllocator->NewPage(page.GetElementSize(), page.GetNElements());
// make sure the page is aware of how many elements it will have
zipItem.fPage.GrowUnchecked(page.GetNElements());
memcpy(zipItem.fPage.GetBuffer(), page.GetBuffer(), page.GetNBytes());
fCounters->fParallelZip.SetValue(1);
// Thread safety: Each thread works on a distinct zipItem which owns its
// compression buffer.
fTaskScheduler->AddTask([this, &zipItem, &sealedPage, &element, allocateBuf, shrinkSealedPage] {

This leads to increased memory usage for high compression ratios when no spare threads are available, or when they are too slow to keep up with consuming the tasks. Heuristically, we could keep track of the total number of "queued" bytes and seal a page immediately on the calling thread if the total exceeds a threshold (approximately the zipped cluster size?). This should still leave enough work for other threads to pick up, while reducing memory usage because only the compressed page is kept.
