Skip to content

Feat[Storagetool]: cluster state ledger (CSL) file search support #558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 85 commits into
base: main
Choose a base branch
from

Conversation

alexander-e1off
Copy link
Collaborator

@alexander-e1off alexander-e1off commented Jan 8, 2025

Enhance storage tool to search records in cluster state ledger (CSL) file.

To search records in CSL file, only CSL file path (--csl-file option) must be passed, journal file path (--journal-path/--journal-path` options must not be present.

Scenarios of BMQStorageTool usage for CSL file

Output summary for CSL file

Example:

bmqstoragetool --csl-file=<path> --summary

Output records from the beginning of CSL file

Example:

bmqstoragetool --csl-file=<path> --csl-from-begin

NOTE: by default search is done from the latest snapshot.

Search and otput only desired record types in CSL file

Example:

bmqstoragetool --csl-file=<path> --csl-record-type=snapshot --csl-record-type=update

NOTE: snapshot, update, commit and ack are supported. Without this option all record types are selected.

Search and otput records details in CSL file

Example:

bmqstoragetool --csl-file=<path> --details

The same search filters could be applied to above scenarios as for journal file search (except --guid filter).

Example of short result output

[ recordType = SNAPSHOT electorTerm = 401 sequenceNumber = 25 timestamp = 1706022776 ][ logId = 37146FB620 offset = 23492436 ]
[ recordType = UPDATE electorTerm = 401 sequenceNumber = 27 timestamp = 1706029885 ][ logId = 37146FB620 offset = 23503460 ]
1 snapshot record(s) found.
1 update record(s) found.
No commit record(s) found.
No ack record(s) found.

Example of detail result output

===================================

    RecordType           : SNAPSHOT
    Offset               : 388
    LogId                : 87EDF15DC0
    ElectorTerm          : 1
    SequenceNumber       : 1
    HeaderWords          : 8
    LeaderAdvisoryWords  : 30
    Timestamp            : 29OCT2024_14:02:54.000000
    Epoch                : 1730210574
    Record               :     [
      choice = [
        leaderAdvisory = [
          sequenceNumber = [
            electorTerm = 1
            sequenceNumber = 1
          ]
          partitions = [
            [
              partitionId = 0
              primaryNodeId = 0
              primaryLeaseId = 2
            ]
          ]
          queues = [
            [
              uri = "bmq://bmq.test.persistent.priority/my-first-queue"
 key = [ 26DACDC974 ]              partitionId = 0
              appIds = [
              ]
            ]
          ]
        ]
      ]
    ]

=================================

    RecordType           : COMMIT
    Offset               : 540
    LogId                : 87EDF15DC0
    ElectorTerm          : 1
    SequenceNumber       : 2
    HeaderWords          : 8
    LeaderAdvisoryWords  : 10
    Timestamp            : 29OCT2024_14:02:54.000000
    Epoch                : 1730210574
    Record               :     [
      choice = [
        leaderAdvisoryCommit = [
          sequenceNumber = [
            electorTerm = 1
            sequenceNumber = 2
          ]
          sequenceNumberCommitted = [
            electorTerm = 1
            sequenceNumber = 1
          ]
        ]
      ]
    ]

1 snapshot record(s) found.
No update record(s) found.
1 commit record(s) found.
No ack record(s) found.

Example of summary result output

1 snapshot record(s) found.

2 update record(s) found, including:
    queueAssignmentAdvisory  : 1
    queueUnassignedAdvisory  : 1

3 commit record(s) found.

No ack record(s) found.

2 Queues found:
[ uri = "bmq://bmq.test.persistent.fanout/my2" key = [ 676EC94E89 ]
 partitionId = 0 appIds = [ [ appId = "foo" appKey = [ 40D805FBE4 ] ] [ appId = "bar" appKey = [ 8377729B23 ] ] [ appId = "baz" appKey = [ 0C996AF949 ] ] ] ]
[ uri = "bmq://bmq.test.persistent.fanout/my1" key = [ 44D469EF30 ]
 partitionId = 0 appIds = [ [ appId = "bar" appKey = [ 9FC222C73E ] ] [ appId = "baz" appKey = [ 935B5C5FC4 ] ] [ appId = "foo" appKey = [ 489D904F25 ] ] ] ]

Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
Signed-off-by: Aleksandr Ivanov <[email protected]>
@alexander-e1off alexander-e1off marked this pull request as ready for review March 6, 2025 11:00
@alexander-e1off alexander-e1off requested a review from a team as a code owner March 6, 2025 11:00
@alexander-e1off alexander-e1off requested a review from 678098 March 6, 2025 11:01
Signed-off-by: Aleksandr Ivanov <[email protected]>
[-h|help]
Where:
-r | --record-type <record type>
record type to search {message|queue-op|journal-op} (default: message)
record type to search {<message>|queue-op|journal-op} (default: message)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need angle brackets for message?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just follow the style of bmqtool to highlight default value in case when it is not explicitly defined in balcl::OptionInfo.

{
d_fields.clear();
d_fields.reserve(9); // max number of fields
d_fields.push_back("RecordType");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using emplace_back

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just follow common bmqu::AlignedPrinter usage, used in codebase. d_fields is std::vector<const char*> type, and in this case push_back and emplace_back are equivalent - both just store the pointer.

Copy link
Collaborator

@678098 678098 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed a part of the PR, have comments

Comment on lines 5 to 9
files. It allows to search records in:
- `journal` file (.bmq_journal) with the set of different filters and output found results in
the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current form of explanation is a bit too heavy and should be simplified and disambiguated:

Suggested change
files. It allows to search records in:
- `journal` file (.bmq_journal) with the set of different filters and output found results in
the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
files. Using a set of different filters, it allows to search records in:
- `journal` file (.bmq_journal).
- `cluster state ledger` (CSL) file (*.bmq_csl).
The output results can be returned in either short or detailed form.

the short or detail form;
- `cluster state ledger` (CSL) file (*.bmq_csl) with the set of different filters and output
found results in the short or detail form;
NOTE: For this mode, `journal` file (.bmq_journal) **must not** be passed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a must, it is better to enforce it in the storage tool itself. Also this note can be merged with the next paragraph

Comment on lines 209 to 210
Filter messages with corresponding composite sequence numbers (defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can make this header shorter by moving the note

Suggested change
Filter messages with corresponding composite sequence numbers (defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Filter messages with corresponding composite sequence numbers
-------------------------------------------------------------
Composite sequence numbers are defined in form `primaryLeaseId-sequenceNumber` for journal file or `electorTerm-sequenceNumber` for CSL file.

Comment on lines 37 to 38
const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber)
bsls::Types::Uint64 leaseId,
bsls::Types::Uint64 sequenceNumber)

Basic arithmetic types passed by value

Comment on lines 60 to 61
explicit CompositeSequenceNumber(const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
explicit CompositeSequenceNumber(const bsls::Types::Uint64 leaseId,
const bsls::Types::Uint64 sequenceNumber);
explicit CompositeSequenceNumber(bsls::Types::Uint64 leaseId,
bsls::Types::Uint64 sequenceNumber);


public:
// CREATORS
CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator);
explicit CslRecordPrinter(bsl::ostream& stream, bslma::Allocator* allocator = 0);
  • nested traits using bslma allocator

const mqbsi::LedgerRecordId& recId)
{
d_fields.clear();
d_fields.reserve(9); // max number of fields
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
d_fields.reserve(9); // max number of fields
d_fields.reserve(10); // max number of fields

There is 9 or 10 fields depending on if branch triggering


namespace BloombergLP {
namespace m_bmqstoragetool {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define destructors here for all classes defined in the header to avoid warnings during compilation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked build logs on CI for both ubuntu and darwin and see no warnings for these classes. CslRecordPrinter is a template class and should be defined in the header. As I know it make sense for virtual classes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked build logs on CI for both ubuntu and darwin and see no warnings for these classes. CslRecordPrinter is a template class and should be defined in the header. As I know it make sense for virtual classes.

Good point


public:
// CREATORS
HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
explicit HumanReadableCslPrinter(bsl::ostream& os, bslma::Allocator* allocator = 0)
  • nested trait uses bslma allocator

public:
// CREATORS

JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator)
explicit JsonCslPrinter(bsl::ostream& os, bslma::Allocator* allocator = 0)
  • nested trait uses bslma allocator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants