
Added more use of ProgressLogger (#8504) #8509


Open · wants to merge 6 commits into master

Conversation

@Vivkzz commented Apr 10, 2025

Closes #8504

Changes:

  • Added ProgressLogger to the FullPruningDb class to show progress while pruning (illustrated by the sketch after this list).
  • The logger starts when pruning begins and updates as keys are processed.
  • Cleaned up the logger after the pruning is done to keep things tidy.
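
For illustration, here is a minimal, self-contained sketch of the threshold-based progress reporting idea, assuming a simple key-copy loop; KeyCopyProgress and its members are hypothetical stand-ins for this discussion, not the actual Nethermind ProgressLogger API used in the PR.

using System;
using System.Diagnostics;

// Stand-in progress reporter: logs once every `threshold` keys instead of on
// every key, so logging overhead stays negligible during a long pruning run.
class KeyCopyProgress
{
    private readonly long _threshold;
    private readonly Stopwatch _watch = Stopwatch.StartNew();
    private long _processed;

    public KeyCopyProgress(long threshold) => _threshold = threshold;

    public void Report()
    {
        _processed++;
        if (_processed % _threshold == 0)
        {
            double rate = _processed / Math.Max(_watch.Elapsed.TotalSeconds, 0.001);
            Console.WriteLine($"Full pruning: copied {_processed:N0} keys ({rate:N0} keys/s)");
        }
    }
}

class Demo
{
    static void Main()
    {
        KeyCopyProgress progress = new(threshold: 100_000);
        for (long i = 0; i < 1_000_000; i++)
        {
            // ... copy one key/value pair into the cloned database here ...
            progress.Report();
        }
    }
}

Logging only at a fixed key interval is the same idea as the 100,000-key threshold discussed later in the review.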

Types of changes

What types of changes does your code introduce?

  • Optimization
  • Refactoring
  • Documentation update
  • Build-related changes

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

Tested this locally, and the ProgressLogger works well, showing progress during the pruning process.

Documentation

Requires documentation update

  • Yes
  • No

If yes, link the PR to the docs update or the issue with the details labeled docs. Remove if not applicable.

Requires explanation in Release Notes

  • Yes
  • No

If yes, fill in the details here. Remove if not applicable.

Remarks

This change makes it easier to see what is happening during the pruning process, which is particularly helpful for such a long-running task.

@asdacap (Contributor) commented Apr 10, 2025

Did you run it?

@asdacap (Contributor) commented Apr 10, 2025

As in, run full pruning.


public PruningContext(FullPruningDb db, IDb cloningDb, bool duplicateReads)
{
    CloningDb = cloningDb;
    DuplicateReads = duplicateReads;
    _db = db;
    // Get total keys count in a more efficient way
    TotalKeys = db._currentDb.GatherMetric().TotalKeys;
Review comment (Member):

We will not be visiting all keys in the database, so your upper bound will be off.

@Vivkzz (Author) commented Apr 11, 2025

@LukaszRozmej @asdacap Thanks for the feedback! I've made the following changes:

  • Increased the progress logging threshold to 100000 keys as suggested
  • Removed the TotalKeys calculation to avoid incorrect upper bound
  • Progress logger now tracks actual processed keys

Regarding the question about running full pruning - I haven't run it yet.

@LukaszRozmej (Member) commented Apr 11, 2025

> @LukaszRozmej @asdacap Thanks for the feedback! I've made the following changes:
>
>   • Increased the progress logging threshold to 100000 keys as suggested
>   • Removed the TotalKeys calculation to avoid incorrect upper bound
>   • Progress logger now tracks actual processed keys
>
> Regarding the question about running full pruning - I haven't run it yet.

Forgot to push?
Please run it, test it, and share the new logs.

@Vivkzz (Author) commented Apr 11, 2025

@LukaszRozmej I already pushed it. Are you able to see it?

@LukaszRozmej (Member) left a review comment:

Yes, I can see it.
FullPruning uses multiple concurrent batches, so your math on processed keys will be off.

Comment on lines 270 to 274
if (!_committed)
{
    // if the context was not committed, then pruning failed and we delete the cloned DB
    CloningDb.Clear();
}
Review comment (Member):

This seems missing?

Reply from @Vivkzz (Author):

@LukaszRozmej

  1. Fixed the concurrent batches issue by implementing Interlocked.Increment for atomic counting of processed keys (a sketch follows this comment)
  2. Restored the code for clearing the cloned database on failure by:
    • Reintroducing the _committed flag
    • Adding back the cleanup code for failed pruning
    • Adding proper disposal checks
  3. Made the batch processing threshold consistent by setting it to 100,000 to match the individual processing threshold

All changes have been tested and include proper DCO sign-off. Please review the updates.
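
For point 1 above, a minimal sketch of atomic counting with Interlocked.Increment, assuming one counter shared by several concurrent pruning batches; ConcurrentKeyCounter and the simulated Parallel.For batches are hypothetical illustrations, not the actual FullPruningDb code.

using System;
using System.Threading;
using System.Threading.Tasks;

class ConcurrentKeyCounter
{
    private long _processed;

    public long Processed => Interlocked.Read(ref _processed);

    // Called once per key written by any batch; safe under concurrency.
    public void OnKeyWritten() => Interlocked.Increment(ref _processed);
}

class Demo
{
    static void Main()
    {
        ConcurrentKeyCounter counter = new();

        // Simulate several pruning batches writing keys in parallel.
        Parallel.For(0, 8, _ =>
        {
            for (int i = 0; i < 250_000; i++)
            {
                counter.OnKeyWritten();
            }
        });

        // Prints 2,000,000 every time; a plain `_processed++` would typically
        // lose increments under concurrency and report a lower number.
        Console.WriteLine($"Processed keys: {counter.Processed:N0}");
    }
}

With a plain increment, the read-modify-write on the shared counter is not atomic, which is why concurrent batches can make the processed-keys math come out wrong.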

@Vivkzz (Author) commented Apr 15, 2025

@LukaszRozmej please review and let me know.

public void Set(ReadOnlySpan<byte> key, byte[]? value, WriteFlags flags = WriteFlags.None)
{
    _writeBatch.Set(key, value, flags);
    _batchProcessedKeys++;
Review comment (Contributor):

Interlocked.

Follow-up review comment (Contributor):

Oh wait. Probably fine here.
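
A short sketch of why the plain increment is probably fine here, assuming each write batch is used by a single thread and only the roll-up into the shared total has to be atomic; the types and names below are hypothetical, not the actual Nethermind batch implementation.

using System;
using System.Threading;

class PruningTotals
{
    public long ProcessedKeys;   // shared across all batches
}

class WriteBatchSketch : IDisposable
{
    private readonly PruningTotals _totals;
    private long _batchProcessedKeys;    // touched by one thread only

    public WriteBatchSketch(PruningTotals totals) => _totals = totals;

    public void Set(byte[] key, byte[]? value)
    {
        // ... stage the write in the underlying batch ...
        _batchProcessedKeys++;           // no Interlocked needed within a batch
    }

    public void Dispose()
    {
        // One atomic roll-up per batch when it is flushed/disposed.
        Interlocked.Add(ref _totals.ProcessedKeys, _batchProcessedKeys);
        _batchProcessedKeys = 0;
    }
}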

@asdacap (Contributor) commented Apr 15, 2025

Can you share some logs?

Development

Successfully merging this pull request may close this issue: More use of ProgressLogger (#8504)

3 participants