Skip to content

Feat[Storagetool]: cluster state ledger (CSL) file search support #558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 91 commits into from
May 12, 2025

Conversation

alexander-e1off
Copy link
Collaborator

@alexander-e1off alexander-e1off commented Jan 8, 2025

Enhance storage tool to search records in cluster state ledger (CSL) file.

To search records in CSL file, only CSL file path (--csl-file option) must be passed, journal file path (--journal-path/--journal-path` options must not be present.

Scenarios of BMQStorageTool usage for CSL file

Output summary for CSL file

Example:

bmqstoragetool --csl-file=<path> --summary

Output records from the beginning of CSL file

Example:

bmqstoragetool --csl-file=<path> --csl-from-begin

NOTE: by default search is done from the latest snapshot.

Search and otput only desired record types in CSL file

Example:

bmqstoragetool --csl-file=<path> --csl-record-type=snapshot --csl-record-type=update

NOTE: snapshot, update, commit and ack are supported. Without this option all record types are selected.

Search and otput records details in CSL file

Example:

bmqstoragetool --csl-file=<path> --details

The same search filters could be applied to above scenarios as for journal file search (except --guid filter).

Example of short result output

[ recordType = SNAPSHOT electorTerm = 401 sequenceNumber = 25 timestamp = 1706022776 ][ logId = 37146FB620 offset = 23492436 ]
[ recordType = UPDATE electorTerm = 401 sequenceNumber = 27 timestamp = 1706029885 ][ logId = 37146FB620 offset = 23503460 ]
1 snapshot record(s) found.
1 update record(s) found.
No commit record(s) found.
No ack record(s) found.

Example of detail result output

===================================

    RecordType           : SNAPSHOT
    Offset               : 388
    LogId                : 87EDF15DC0
    ElectorTerm          : 1
    SequenceNumber       : 1
    HeaderWords          : 8
    LeaderAdvisoryWords  : 30
    Timestamp            : 29OCT2024_14:02:54.000000
    Epoch                : 1730210574
    Record               :     [
      choice = [
        leaderAdvisory = [
          sequenceNumber = [
            electorTerm = 1
            sequenceNumber = 1
          ]
          partitions = [
            [
              partitionId = 0
              primaryNodeId = 0
              primaryLeaseId = 2
            ]
          ]
          queues = [
            [
              uri = "bmq://bmq.test.persistent.priority/my-first-queue"
 key = [ 26DACDC974 ]              partitionId = 0
              appIds = [
              ]
            ]
          ]
        ]
      ]
    ]

=================================

    RecordType           : COMMIT
    Offset               : 540
    LogId                : 87EDF15DC0
    ElectorTerm          : 1
    SequenceNumber       : 2
    HeaderWords          : 8
    LeaderAdvisoryWords  : 10
    Timestamp            : 29OCT2024_14:02:54.000000
    Epoch                : 1730210574
    Record               :     [
      choice = [
        leaderAdvisoryCommit = [
          sequenceNumber = [
            electorTerm = 1
            sequenceNumber = 2
          ]
          sequenceNumberCommitted = [
            electorTerm = 1
            sequenceNumber = 1
          ]
        ]
      ]
    ]

1 snapshot record(s) found.
No update record(s) found.
1 commit record(s) found.
No ack record(s) found.

Example of summary result output

1 snapshot record(s) found.

2 update record(s) found, including:
    queueAssignmentAdvisory  : 1
    queueUnassignedAdvisory  : 1

3 commit record(s) found.

No ack record(s) found.

2 Queues found:
[ uri = "bmq://bmq.test.persistent.fanout/my2" key = [ 676EC94E89 ]
 partitionId = 0 appIds = [ [ appId = "foo" appKey = [ 40D805FBE4 ] ] [ appId = "bar" appKey = [ 8377729B23 ] ] [ appId = "baz" appKey = [ 0C996AF949 ] ] ] ]
[ uri = "bmq://bmq.test.persistent.fanout/my1" key = [ 44D469EF30 ]
 partitionId = 0 appIds = [ [ appId = "bar" appKey = [ 9FC222C73E ] ] [ appId = "baz" appKey = [ 935B5C5FC4 ] ] [ appId = "foo" appKey = [ 489D904F25 ] ] ] ]

Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Copy link
Collaborator

@678098 678098 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed a part of the PR, have comments

Comment on lines 5 to 9
files. It allows to search records in:
- `journal` file (.bmq_journal) with the set of different filters and output found results in
the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current form of explanation is a bit too heavy and should be simplified and disambiguated:

Suggested change
files. It allows to search records in:
- `journal` file (.bmq_journal) with the set of different filters and output found results in
the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
files. Using a set of different filters, it allows to search records in:
- `journal` file (.bmq_journal).
- `cluster state ledger` (CSL) file (*.bmq_csl).
The output results can be returned in either short or detailed form.

the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
NOTE: For this mode, `journal` file (.bmq_journal) **must not** be passed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a must, it is better to enforce it in the storage tool itself. Also this note can be merged with the next paragraph

Comment on lines 209 to 210
Filter messages with corresponding composite sequence numbers (defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can make this header shorter by moving the note

Suggested change
Filter messages with corresponding composite sequence numbers (defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Filter messages with corresponding composite sequence numbers
-------------------------------------------------------------
Composite sequence numbers are defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file.

Comment on lines 37 to 38
const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber)
bsls::Types::Uint64 leaseId,
bsls::Types::Uint64 sequenceNumber)

Basic arithmetic types passed by value

Comment on lines 60 to 61
explicit CompositeSequenceNumber(const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
explicit CompositeSequenceNumber(const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber);
explicit CompositeSequenceNumber(bsls::Types::Uint64 leaseId,
bsls::Types::Uint64 sequenceNumber);


public:
// CREATORS
CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator);
explicit CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator = 0);
  • nested traits using bslma allocator

const mqbsi::LedgerRecordId& recId)
{
d_fields.clear();
d_fields.reserve(9); // max number of fields
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
d_fields.reserve(9); // max number of fields
d_fields.reserve(10); // max number of fields

There is 9 or 10 fields depending on if branch triggering


namespace BloombergLP {
namespace m_bmqstoragetool {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define destructors here for all classes defined in the header to avoid warnings during compilation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked build logs on CI for both ubuntu and darwin and see no warnings for these classes. CslRecordPrinter is a template class and should be defined in the header. As I know it make sense for virtual classes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked build logs on CI for both ubuntu and darwin and see no warnings for these classes. CslRecordPrinter is a template class and should be defined in the header. As I know it make sense for virtual classes.

Good point


public:
// CREATORS
HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
explicit HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator = 0)
  • nested trait uses bslma allocator

public:
// CREATORS

JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
explicit JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator = 0)
  • nested trait uses bslma allocator

@kaikulimu
Copy link
Collaborator

I will give it a try and leave comments

@kaikulimu kaikulimu self-requested a review May 7, 2025 13:39
@kaikulimu
Copy link
Collaborator

kaikulimu commented May 7, 2025

@alexander-e1off Some feedback:

Summary has bug with filters

Say a CSL file has an initial snapshot record with zero queues, followed by two queue assignment update records. All records have corresponding commit. If the user runs /bmqstoragetool.tsk --csl-file=<path> --csl-record-type=snapshot --summary, it will show the following:

1 snapshot record(s) found.

2 Queues found:
[ uri = "bmq://bmq.test.mmap.fanout/q2" key = [ 6468D6506E ]
 partitionId = 1 appIds = [ ] ]
[ uri = "bmq://bmq.test.mmap.fanout/q1" key = [ 18F848CC18 ]
 partitionId = 0 appIds = [ ] ]

It says 2 queues found in the summary, but those information did not come solely from the snapshot-type records. We observe similar issue if we run /bmqstoragetool.tsk --csl-file=<path> --csl-record-type=ack --summary:

No ack record(s) found.

2 Queues found:
[ uri = "bmq://bmq.test.mmap.fanout/q2" key = [ 6468D6506E ]
 partitionId = 1 appIds = [ ] ]
[ uri = "bmq://bmq.test.mmap.fanout/q1" key = [ 18F848CC18 ]
 partitionId = 0 appIds = [ ] ]

This is confusing to the user. We have two options:

  1. (Easier but sufficient) Prohibit filtering by record type, and maybe some other filters, while we are in summary mode. Summaries should be short enough that they won't be too long even if filters are prohibited.
  2. Enhance summary mode to show summary over only the filtered in records.

@kaikulimu kaikulimu assigned alexander-e1off and unassigned 678098 May 7, 2025
Copy link
Collaborator

@kaikulimu kaikulimu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment

@alexander-e1off
Copy link
Collaborator Author

@kaikulimu, thanks for your feedback. Agree that summary should cover all content of CSL file. I've fixed it with option 1:

./bmqstoragetool.tsk --csl-file=<path> --summary --csl-record-type=snapshot
Arguments validation failed:
'--summary' can't be combined with '--details' and filters (record type, range, queue name/key, offset, seq.num) options, as it calculates and outputs statistics

kaikulimu
kaikulimu previously approved these changes May 9, 2025
Copy link
Collaborator

@kaikulimu kaikulimu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kaikulimu
Copy link
Collaborator

@678098 Your review status is "Requested changes", so we can't merge this PR yet. Please review again, and approve and merge if everything looks good.

@kaikulimu kaikulimu merged commit 214d138 into bloomberg:main May 12, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants