feat(clp-s-ffi-sfa): Expose file list and range info metadata in ClpArchiveReader; Refactor and clean up unit tests.#2093
Conversation
WalkthroughPrecomputation of archive metadata was moved out of the constructor and made to return Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveReader.cpp`:
- Around line 133-142: Loop is currently emitting one FileInfo per range
fragment; instead detect split fragments using the archive constant
(clp_s::archive_constants::_file_split_number) and coalesce by filename: for
each range (range.fields, key clp_s::constants::range_index::cFilename) check if
a FileInfo for that filename already exists in m_file_infos and if so update its
start_idx = min(existing.start_idx, start_idx) and end_idx =
max(existing.end_idx, end_idx) (and do not push a duplicate into m_file_names),
otherwise emplace a new FileInfo as currently done; ensure detection uses the
split-number field when present so fragments are merged into a single logical
file entry.
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveReader.hpp`:
- Around line 17-42: The header uses std::string (members and return type in
class FileInfo and method get_file_name which returns std::string const&
referencing m_file_name) but doesn't include <string>; add the missing `#include`
<string> at the top of ClpArchiveReader.hpp so the declaration of std::string
and uses of m_file_name/get_file_name compile without relying on transitive
includes.
In `@components/core/tests/test-clp_s-ffi_sfa_reader.cpp`:
- Around line 77-88: The tests currently only assert internal consistency
between reader.get_file_names() and reader.get_file_infos() but do not verify
the actual expected filenames or ranges; update the test to assert concrete
expectations for each fixture: replace the non-empty and equality checks with
assertions that the sizes equal the expected number of files, then iterate over
reader.get_file_names() / reader.get_file_infos() and assert file_name ==
expected_filenames[i], file_info.get_file_name() == expected_filenames[i],
file_info.get_start_index() == expected_ranges[i].start,
file_info.get_end_index() == expected_ranges[i].end (or expected_event_count
when appropriate), and file_info.get_event_count() == expected_ranges[i].count
so multi-file ordering and ranges are validated explicitly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 74dbdec3-ee2e-4ba0-a30a-f56f11d3494f
📒 Files selected for processing (3)
components/core/src/clp_s/ffi/sfa/ClpArchiveReader.cppcomponents/core/src/clp_s/ffi/sfa/ClpArchiveReader.hppcomponents/core/tests/test-clp_s-ffi_sfa_reader.cpp
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/core/tests/test-clp_s-ffi_sfa_reader.cpp (1)
43-58:⚠️ Potential issue | 🟡 MinorClear the per-fixture output directory before recompressing.
This helper reuses a deterministic
output_dirbased onlog_path.stem(), but it never removes stale contents under that path. A dirty or interrupted prior run can therefore influence the next archive and hide regressions. Removing the directory first keeps the fixture hermetic.Based on learnings, files and directories are intentionally removed at the beginning of CLP tests so existing content cannot influence the result.💡 Proposed fix
auto const output_dir{root_output_dir / log_path.stem().string()}; +std::filesystem::remove_all(output_dir); auto const archive_stats = compress_archive(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/tests/test-clp_s-ffi_sfa_reader.cpp` around lines 43 - 58, The helper generate_single_file_archive reuses a deterministic output_dir (derived from get_archive_output_root_dir() and log_path.stem()) but doesn't clear previous contents, so remove any stale data before recompressing: call std::filesystem::remove_all(output_dir) (handle errors/exceptions as appropriate) and then recreate the directory (std::filesystem::create_directories(output_dir)) prior to calling compress_archive; reference output_dir and generate_single_file_archive to locate the change.
♻️ Duplicate comments (1)
components/core/tests/test-clp_s-ffi_sfa_reader.cpp (1)
81-92: 🛠️ Refactor suggestion | 🟠 MajorAdd at least one real multi-file archive case.
These checks still only exercise the one-file path, so the new ordering and non-zero start-index behaviour for later files never executes here. A regression in per-file range accumulation or file ordering would still pass. Please add a fixture that packs multiple input files into one archive and assert each filename/range explicitly.
Also applies to: 125-140
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/tests/test-clp_s-ffi_sfa_reader.cpp` around lines 81 - 92, The test only covers the single-file path so it doesn't exercise multi-file ordering or non-zero start-index behavior; update the test that uses reader by adding a fixture that packs multiple input files into one archive, invoke reader.get_file_names() and reader.get_file_infos(), assert the returned sizes > 1, then iterate the files asserting the expected file name ordering and per-file ranges using file_info.get_start_index(), file_info.get_end_index(), and file_info.get_event_count() for each file to verify accumulated start/end indices and counts match the inputs; ensure at least one case has a non-zero start index to catch regressions in per-file range accumulation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveReader.cpp`:
- Around line 32-34: Update the docblocks for the ClpArchiveReader factory API
(both create() variants that now call precompute_archive_metadata()) to document
that callers may receive the new MalformedRangeIndex failure in addition to the
previously listed errors; specifically mention MalformedRangeIndex alongside
existing error cases in the header comments for the create() paths and note that
it is propagated from ClpArchiveReader::precompute_archive_metadata().
---
Outside diff comments:
In `@components/core/tests/test-clp_s-ffi_sfa_reader.cpp`:
- Around line 43-58: The helper generate_single_file_archive reuses a
deterministic output_dir (derived from get_archive_output_root_dir() and
log_path.stem()) but doesn't clear previous contents, so remove any stale data
before recompressing: call std::filesystem::remove_all(output_dir) (handle
errors/exceptions as appropriate) and then recreate the directory
(std::filesystem::create_directories(output_dir)) prior to calling
compress_archive; reference output_dir and generate_single_file_archive to
locate the change.
---
Duplicate comments:
In `@components/core/tests/test-clp_s-ffi_sfa_reader.cpp`:
- Around line 81-92: The test only covers the single-file path so it doesn't
exercise multi-file ordering or non-zero start-index behavior; update the test
that uses reader by adding a fixture that packs multiple input files into one
archive, invoke reader.get_file_names() and reader.get_file_infos(), assert the
returned sizes > 1, then iterate the files asserting the expected file name
ordering and per-file ranges using file_info.get_start_index(),
file_info.get_end_index(), and file_info.get_event_count() for each file to
verify accumulated start/end indices and counts match the inputs; ensure at
least one case has a non-zero start index to catch regressions in per-file range
accumulation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 18eb316c-5297-4226-af65-989b3e528e79
📒 Files selected for processing (5)
components/core/src/clp_s/ffi/sfa/ClpArchiveReader.cppcomponents/core/src/clp_s/ffi/sfa/ClpArchiveReader.hppcomponents/core/src/clp_s/ffi/sfa/SfaErrorCode.cppcomponents/core/src/clp_s/ffi/sfa/SfaErrorCode.hppcomponents/core/tests/test-clp_s-ffi_sfa_reader.cpp
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveReader.hpp`:
- Around line 92-102: The two accessor methods have inconsistent return
semantics: get_file_names() returns a vector by value while get_file_infos()
returns a const reference, which can dangle if close() clears
m_file_names/m_file_infos; make them consistent by returning the same type for
both—prefer returning by value for safety (change get_file_infos() to return
std::vector<FileInfo> by value) or, if you choose efficiency, return const& for
both (change get_file_names() to return std::vector<std::string> const&), and
update callers accordingly; refer to get_file_names(), get_file_infos(),
close(), and members m_file_names/m_file_infos when applying the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 000ac3b6-3d8c-4b4b-bd2b-e5e08171c181
📒 Files selected for processing (1)
components/core/src/clp_s/ffi/sfa/ClpArchiveReader.hpp
| /** | ||
| * @return Source file names in range-index order. | ||
| */ | ||
| [[nodiscard]] auto get_file_names() const -> std::vector<std::string> { return m_file_names; } | ||
|
|
||
| /** | ||
| * @return Source file metadata in range index order. | ||
| */ | ||
| [[nodiscard]] auto get_file_infos() const -> std::vector<FileInfo> const& { | ||
| return m_file_infos; | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider making return types consistent between get_file_names() and get_file_infos().
get_file_names() returns by value (safe copy), while get_file_infos() returns by const reference (efficient but can dangle if close() is called while the reference is held). Given that close() clears both vectors, this inconsistency could lead to subtle lifetime bugs.
Consider either:
- Return both by const reference for efficiency and consistent semantics, or
- Return both by value for safety
For SFA-scoped archives with typically small file counts, the performance difference is likely negligible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveReader.hpp` around lines 92 -
102, The two accessor methods have inconsistent return semantics:
get_file_names() returns a vector by value while get_file_infos() returns a
const reference, which can dangle if close() clears m_file_names/m_file_infos;
make them consistent by returning the same type for both—prefer returning by
value for safety (change get_file_infos() to return std::vector<FileInfo> by
value) or, if you choose efficiency, return const& for both (change
get_file_names() to return std::vector<std::string> const&), and update callers
accordingly; refer to get_file_names(), get_file_infos(), close(), and members
m_file_names/m_file_infos when applying the change.
Description
This change extends
clp_s::ffi:sfa::ClpArchiveReaderto expose per file metadata derived from the range index, including file names and file ranges.The new
ClpArchiveReader::FileInfoclass should roughly match the JS prototype FileInfo.Checklist
breaking change.
Validation performed
Summary by CodeRabbit