You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Improve segment serialization for v1 segments (#2183)
#### Reference Issues/PRs
Ref Monday Ticket: 7852509418
#### What does this implement or fix?
This PR is fixing a segment fault that occurs in the
`CopyCompressedInterStoreTask` when used in enterprise.
The problem is a bit hard to reproduce as it involves:
1. Having some version V being written with an older version of ArcticDB
(e.g. 4.4.5) to some primary
2. Trying to replicate this version V to some secondary with a version
of enterprise that is using a newer version of ArcticDB (e.g. 5.2.3) and
that version should have a changed structure of the Segments (e.g. what
happened
[here](e3b5f53...757b4bd#diff-a17e4a0ad7760267af8c570795b362ec900136e414782a4b883485d36d9a8f84R167))
3. Because the newer version is trying to write more data than
allocated, we are getting a seg fault
The bug is that `CopyCompressedInterStoreTask` is trying to copy the
Segment but the information about its size is stale because it was
saved/reserialized based on the information from the older version.
The fix involves:
- Add null pointer checks in `decode_header_and_fields` and
`serialize_v1_header_to_buffer`
- the check in `decode_header_and_fields` is not needed for the fix but
it is a good change to prevent silly seg faults
- Simplify `calculate_size()` by removing unnecessary conditional
- Enhance buffer allocation and copying logic in
`serialize_v1_header_to_buffer`
- Add explicit size and pointer validation checks to prevent potential
memory issues
Technically, only [this
change](https://github.com/man-group/ArcticDB/pull/2183/files#diff-d35d87cb4a49c502a6519b27dc94590231057f633894d26c6c34f506fee1980fR154)
is needed for the fix of this bug.
But I think that the change in the
[segment.cpp](https://github.com/man-group/ArcticDB/pull/2183/files#diff-21f07ddd0b2b8ab0b82fd8392af726ab2a0b5ccafedab67d86497177d4f381a2R283)
make the code there easier to read/reason about + I've added some more
checks to make it more robust.
**Because the change is quite hard to test here, I've added the relevant
test for it in the enterprise repo with [this
PR](#2183
## Change Type (Required)
- [x] **Patch** (Bug fix or non-breaking improvement)
- [ ] **Minor** (New feature, but backward compatible)
- [ ] **Major** (Breaking changes)
- [ ] **Cherry pick**
#### Any other comments?
The fix is resolving seg faults like:
``` bash
==40129== Thread 123 IOPool0:
==40129== Invalid write of size 8
==40129== at 0x4C2E8BE: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1035)
==40129== by 0xC22DBCF1: arcticdb::Segment::serialize_v1_header_to_buffer(unsigned long) (segment.cpp:301)
==40129== by 0xC22DBED6: arcticdb::Segment::serialize_header_v1() (segment.cpp:320)
==40129== by 0xC22DBF5D: arcticdb::Segment::serialize_header() (segment.cpp:326)
==40129== by 0xC3BD7A30: arcticdb::storage::s3::S3ClientImpl::put_object(std::string const&, arcticdb::Segment&, std::string const&, arcticdb::storage::s3::PutHeader) (s3_client_impl.cpp:192)
==40129== by 0xC3BF1289: void arcticdb::storage::s3::detail::do_write_impl<arcticdb::storage::object_store_utils::FlatBucketizer>(arcticdb::storage::KeySegmentPair&, std::string const&, std::string const&, arcticdb::storage::s3::S3ClientInterface&, arcticdb::storage::object_store_utils::FlatBucketizer&&) (detail-inl.hpp:140)
==40129== by 0xC3BE98A9: arcticdb::storage::s3::S3Storage::do_write(arcticdb::storage::KeySegmentPair&) (s3_storage.cpp:50)
==40129== by 0xE608111: void arcticdb::storage::Storage::write<arcticdb::storage::KeySegmentPair&>(arcticdb::storage::KeySegmentPair&) (storage.hpp:48)
==40129== by 0xE5F5C8B: arcticdb::storage::Storages::write(arcticdb::storage::KeySegmentPair&) (storages.hpp:46)
==40129== by 0xE5F8245: arcticdb::storage::Library::write(arcticdb::storage::KeySegmentPair&) (library.hpp:90)
==40129== by 0xE698D85: arcticdb::async::AsyncStore<arcticdb::util::SysClock>::write_compressed_sync(arcticdb::storage::KeySegmentPair) (async_store.hpp:180)
==40129== by 0xE70ABDF: arcticdb::async::CopyCompressedInterStoreTask::copy() (tasks.hpp:402)
==40129== Address 0x1416413b0 is 0 bytes after a block of size 1,151,856 alloc'd
==40129== at 0x4C29F73: malloc (vg_replace_malloc.c:309)
==40129== by 0xC373EE45: arcticdb::AllocatorImpl<arcticdb::NullTracingPolicy, arcticdb::util::LinearClock>::internal_alloc(unsigned long) (allocator.cpp:194)
==40129== by 0xC373F143: arcticdb::AllocatorImpl<arcticdb::NullTracingPolicy, arcticdb::util::LinearClock>::aligned_alloc(unsigned long, bool) (allocator.cpp:303)
==40129== by 0xC1DAD983: arcticdb::Buffer::resize(unsigned long) (buffer.hpp:232)
==40129== by 0xC1DAD78A: arcticdb::Buffer::ensure(unsigned long) (buffer.hpp:177)
==40129== by 0xC22DBBC5: arcticdb::Segment::serialize_v1_header_to_buffer(unsigned long) (segment.cpp:283)
==40129== by 0xC22DBED6: arcticdb::Segment::serialize_header_v1() (segment.cpp:320)
==40129== by 0xC22DBF5D: arcticdb::Segment::serialize_header() (segment.cpp:326)
==40129== by 0xC3BD7A30: arcticdb::storage::s3::S3ClientImpl::put_object(std::string const&, arcticdb::Segment&, std::string const&, arcticdb::storage::s3::PutHeader) (s3_client_impl.cpp:192)
==40129== by 0xC3BF1289: void arcticdb::storage::s3::detail::do_write_impl<arcticdb::storage::object_store_utils::FlatBucketizer>(arcticdb::storage::KeySegmentPair&, std::string const&, std::string const&, arcticdb::storage::s3::S3ClientInterface&, arcticdb::storage::object_store_utils::FlatBucketizer&&) (detail-inl.hpp:140)
==40129== by 0xC3BE98A9: arcticdb::storage::s3::S3Storage::do_write(arcticdb::storage::KeySegmentPair&) (s3_storage.cpp:50)
==40129== by 0xE608111: void arcticdb::storage::Storage::write<arcticdb::storage::KeySegmentPair&>(arcticdb::storage::KeySegmentPair&) (storage.hpp:48)
==40129==
```
#### Checklist
<details>
<summary>
Checklist for code changes...
</summary>
- [ ] Have you updated the relevant docstrings, documentation and
copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's
features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error
messages](https://docs.arcticdb.io/error_messages/)?
- [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in
autogenerated release notes?
</details>
<!--
Thanks for contributing a Pull Request to ArcticDB! Please ensure you
have taken a look at:
- ArcticDB's Code of Conduct:
https://github.com/man-group/ArcticDB/blob/master/CODE_OF_CONDUCT.md
- ArcticDB's Contribution Licensing:
https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/contributing.md#contribution-licensing
-->
ARCTICDB_TRACE(log::storage(), "Header doesn't fit in internal buffer, needed {} bytes but had {}, writing to temp buffer at {:x}", hdr_size, buffer_.preamble_bytes(),uintptr_t(tmp->data()));
283
-
tmp->ensure(calculate_size());
283
+
auto bytes_to_copy = buffer().bytes();
284
+
auto offset = FIXED_HEADER_SIZE + hdr_size;
285
+
286
+
auto total_size = offset + bytes_to_copy;
287
+
288
+
// Verify we have enough space for everything
289
+
tmp->ensure(total_size);
290
+
util::check(tmp->available() >= total_size,
291
+
"Buffer available space {} is less than required size {}",
292
+
tmp->available(),
293
+
total_size);
294
+
295
+
// This is both a sanity check and a way to populate the segment with the correct size
296
+
auto calculated_size = calculate_size();
297
+
util::check(total_size == calculated_size, "Expected total size {} to be equal to calculated size {}", total_size, calculated_size);
0 commit comments