Commit f5b74fa

Functionality that will be shared, moved from TraceStream.cc

- Moved into util.cc
- Added forward_to to skip trace data to some arbitrary point in time

Getters required to expose data

We need to be able to expose this data so it can be serialized.

Find original exe for ReplayTask

Digs out the original executable image that this task was forked from or, in the case of exec, exec'd on. This is required for persistent checkpointing, so that the names in the proc fs correspond to the correct name at replay time (i.e. the task behaves and looks the same in the proc fs as in a normal replay). The thread name is not what should show up in /proc/tid/comm; the actual executable is. So we need to be able to find this "original exe" of the task.

Check if Event is checkpointable

Required for the create-checkpoints command, etc., to determine which events in the trace are checkpointable when there is no live session. In future commits/PRs, remove the static function in ReplaySession.cc that does the same thing and use this member function on Event instead.

Additional proc fs query paths

Gets additional proc fs paths for a task, in this case /mem. Required for persistent checkpointing to figure out how to handle mappings and what to serialize (and what not to serialize).

Lifted CloneCompletion out of Session

The function extract_name will also be required for setting up syscall buffer stuff in coming commits.

Getters/setters required for PCP

Need to be able to set this data when restoring an address space.

Persistent checkpointing

Added a persistent checkpoint schema for capnproto, rr_pcp.capnp, as well as a compile command for it in CMakeLists.txt that works like the existing one for rr_trace.capnp.

The CheckpointInfo and MarkData types work as intermediaries between a serialized checkpoint and a deserialized "live" one. MarkData is used for copying the contents of Mark, InternalMark, ProtoMark and their various data into for serialization, and for reconstructing those types when deserializing. The reasoning for adding MarkData is to not intrude on the Mark/InternalMark/ProtoMark interface and possibly break some guarantees or invariants they provide. If something goes wrong now, it is constrained to persistent checkpointing not reconstituting a session properly.

The GDB spawned by RR now has 2 additional commands: write-checkpoints, which serializes any checkpoints set by the `checkpoint` command, and load-checkpoints.

Added the rr create-checkpoints command, which creates persistent checkpoints at a specified interval that it attempts to honor as closely as possible.

RerunCommand and ReplayCommand are now aware of PCPs. Replay sessions get spawned from persistent checkpoints if they exist on disk when using `-g <evt>`, or when using `-f <pid>` and that "task" was created some time after a persistent checkpoint. Added the --ignore-pcp flag to these commands, which ignores PCPs and spawns sessions normally.

fixup for can_checkpoint_at

Restored comments that existed in the static function in ReplaySession.cc
Changed all uses of this to Event::can_checkpoint_at
Removed static can_checkpoint_at in ReplaySession.cc

Fix preferred include & unnecessary check for partial init

Since checkpoints are partially initialized, checking that they are is pointless.

Added cmake command looping over trace files per request by @khuey

remove init check of member variables.

Move extract_name from Session into util.h.

Removed stream_util, moved contents to util.h

make ignore-pcp not take up '-i'

Moved responsibility of de/ser into FdTable and FileMonitor

Deserializing and serializing an FdTable is now performed by the class itself instead of in a free function. FileMonitor has a public member function that is used for serialization. Each derived type that requires special/additional logic extends the virtual member function serialize_type.

Remove skipMonitoringMappedFd

Not necessary for serialization, as FdTable is separately restored.

Refactor task OS-name setting

Task::copy_state sets the OS name of a task in the same fashion that persistent checkpointing sets the name. Refactored this functionality into Task::set_name. Also removed the unnecessary `update_prname` from Task::copy_state. update_prname is not a "write to tracee" operation but a "read from tracee" operation; since we already know what name we want to set Task::prname to, we skip this read from the tracee in Task::copy_state and just set it to the parameter passed in to Task::set_name.

Add const qualifier

Fixes #3678

Refactor so that marks_with_checkpoints is changed in just one place instead of being accessed arbitrarily. Ref counts had the same changes in a previous commit. Fixes a bug for loaded persistent checkpoints where the re-created checkpoints did not get their reference counting correct. This closes #3678.

Fixed pread-related bug

pread may or may not read the requested size. This was not taken into account. If the data is large, we probably can't read it in one go.

Fixed signed overflow

Added the first test for persistent checkpoints

Address missing arch() serialization

Previous work assumed everything was native. That doesn't work when debugging a 32-bit application on a 64-bit machine. That means some `SupportedArch` fields need to be serialized as well, so that if the process was a 32-bit application it gets re-created correctly. This was instantly made visible by the provided test (although it did not test it explicitly).

Fix issues so that ARM test passes

Changed test

On ARM, we start in a different place than _start (some assembly, possibly hand-rolled), which made setting a breakpoint at 48 wrong, so had to put the filename:line number. Tests should pass now.

`create-checkpoints` should handle "latest-trace"

The create-checkpoints command needs the actual trace dir resolved, so we let `TraceReader` handle that for us.
1 parent 1d59f70 commit f5b74fa

62 files changed, +3093 -274 lines changed

CMakeLists.txt

+25 -11
@@ -556,17 +556,26 @@ endforeach(generated_file)
 
 add_custom_target(Generated DEPENDS ${GENERATED_FILES})
 
-add_custom_command(OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/rr_trace.capnp.c++"
-                          "${CMAKE_CURRENT_BINARY_DIR}/rr_trace.capnp.h"
-                   COMMAND capnp compile
-                           "--src-prefix=${CMAKE_CURRENT_SOURCE_DIR}/src"
-                           "-oc++:${CMAKE_CURRENT_BINARY_DIR}"
-                           "${CMAKE_CURRENT_SOURCE_DIR}/src/rr_trace.capnp"
-                   DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/src/rr_trace.capnp")
-set_source_files_properties("${CMAKE_CURRENT_BINARY_DIR}/rr_trace.capnp.c++"
-                            PROPERTIES GENERATED true)
-set_source_files_properties("${CMAKE_CURRENT_BINARY_DIR}/rr_trace.capnp.h"
-                            PROPERTIES GENERATED true HEADER_FILE_ONLY true)
+
+set(CAPNP_FILES
+  rr_trace
+  rr_pcp
+)
+
+# Compile capnproto files
+foreach(capnp_file ${CAPNP_FILES})
+  add_custom_command(OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${capnp_file}.capnp.c++"
+                            "${CMAKE_CURRENT_BINARY_DIR}/${capnp_file}.capnp.h"
+                     COMMAND capnp compile
+                             "--src-prefix=${CMAKE_CURRENT_SOURCE_DIR}/src"
+                             "-oc++:${CMAKE_CURRENT_BINARY_DIR}"
+                             "${CMAKE_CURRENT_SOURCE_DIR}/src/${capnp_file}.capnp"
+                     DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/src/${capnp_file}.capnp")
+  set_source_files_properties("${CMAKE_CURRENT_BINARY_DIR}/${capnp_file}.capnp.c++"
+                              PROPERTIES GENERATED true)
+  set_source_files_properties("${CMAKE_CURRENT_BINARY_DIR}/${capnp_file}.capnp.h"
+                              PROPERTIES GENERATED true HEADER_FILE_ONLY true)
+endforeach()
 
 if (${CMAKE_SYSTEM_PROCESSOR} STREQUAL "aarch64")
   set(BLAKE_ARCH_DIR third-party/blake2/neon)
@@ -578,12 +587,14 @@ set(RR_SOURCES
   src/AddressSpace.cc
   src/AutoRemoteSyscalls.cc
   src/BuildidCommand.cc
+  src/CheckpointInfo.cc
   src/Command.cc
   src/CompressedReader.cc
   src/CompressedWriter.cc
   src/ContextSwitchEvent.cc
   src/CPUFeaturesCommand.cc
   src/CPUIDBugDetector.cc
+  src/CreateCheckpointsCommand.cc
   src/DiversionSession.cc
   src/DumpCommand.cc
   src/Dwarf.cc
@@ -621,6 +632,7 @@ set(RR_SOURCES
   src/PackCommand.cc
   src/PerfCounters.cc
   src/PerfCounterBuffers.cc
+  src/PersistentCheckpointing.cc
   src/PidFdMonitor.cc
   src/processor_trace_check.cc
   src/ProcFdDirMonitor.cc
@@ -660,6 +672,7 @@ set(RR_SOURCES
   src/WaitManager.cc
   src/WaitStatus.cc
   ${CMAKE_CURRENT_BINARY_DIR}/rr_trace.capnp.c++
+  ${CMAKE_CURRENT_BINARY_DIR}/rr_pcp.capnp.c++
   ${BLAKE_ARCH_DIR}/blake2b.c
 )
 
@@ -1499,6 +1512,7 @@ set(TESTS_WITH_PROGRAM
   # check_session_leaks
   checkpoint_dying_threads
   checkpoint_mixed_mode
+  checkpoint_persistent_shmem
   checksum_sanity
   check_lost_interrupts
   clone_file_range

src/AddressSpace.cc

+5
@@ -554,6 +554,11 @@ void AddressSpace::save_auxv(Task* t) {
   save_interpreter_base(t, saved_auxv());
 }
 
+void AddressSpace::restore_auxv(Task* t, std::vector<uint8_t>&& auxv) {
+  saved_auxv_ = std::move(auxv);
+  save_interpreter_base(t, saved_auxv());
+}
+
 void AddressSpace::save_interpreter_base(Task* t, std::vector<uint8_t> auxv) {
   saved_interpreter_base_ = read_interpreter_base(auxv);
   save_ld_path(t, saved_interpreter_base());

src/AddressSpace.h

+20
@@ -662,6 +662,14 @@ class AddressSpace : public HasTaskSet {
    * Dies if no shm size is registered for the address.
    */
   size_t get_shm_size(remote_ptr<void> addr) { return shm_sizes[addr]; }
+
+  /**
+   * Check if `map` is shared memory
+   */
+  bool has_shm_at(const KernelMapping& map) const {
+    return shm_sizes.find(map.start()) != std::cend(shm_sizes);
+  }
+
   void remove_shm_size(remote_ptr<void> addr) { shm_sizes.erase(addr); }
 
   /**
@@ -795,6 +803,9 @@ class AddressSpace : public HasTaskSet {
   const std::vector<uint8_t>& saved_auxv() { return saved_auxv_; }
   void save_auxv(Task* t);
 
+  /* Used when restoring persistent checkpoints. */
+  void restore_auxv(Task* t, std::vector<uint8_t>&& auxv);
+
   remote_ptr<void> saved_interpreter_base() { return saved_interpreter_base_; }
   void save_interpreter_base(Task* t, std::vector<uint8_t> auxv);
 
@@ -873,6 +884,15 @@ class AddressSpace : public HasTaskSet {
 
   bool legacy_breakpoint_mode() { return stopping_breakpoint_table_ != nullptr; }
   remote_code_ptr do_breakpoint_fault_addr() { return do_breakpoint_fault_addr_; }
+
+  void set_breakpoint_fault_addr(remote_code_ptr addr) {
+    do_breakpoint_fault_addr_ = addr;
+  }
+
+  void set_uses_syscall_buffer(bool uses_syscall_buffer = true) {
+    syscallbuf_enabled_ = uses_syscall_buffer;
+  }
+
   remote_code_ptr stopping_breakpoint_table() { return stopping_breakpoint_table_; }
   int stopping_breakpoint_table_entry_size() { return stopping_breakpoint_table_entry_size_; }

src/BpfMapMonitor.h

+8 -1
@@ -14,12 +14,19 @@ class BpfMapMonitor : public FileMonitor {
 public:
   BpfMapMonitor(uint64_t key_size, uint64_t value_size) : key_size_(key_size), value_size_(value_size) {}
 
-  virtual Type type() override { return BpfMap; }
+  virtual Type type() const override { return BpfMap; }
 
   uint64_t key_size() const { return key_size_; }
   uint64_t value_size() const { return value_size_; }
 
 private:
+  virtual void serialize_type(
+      pcp::FileMonitor::Builder& builder) const noexcept override {
+    auto bpf = builder.initBpf();
+    bpf.setKeySize(key_size_);
+    bpf.setValueSize(value_size_);
+  }
+
   uint64_t key_size_;
   uint64_t value_size_;
 };
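
The commit message describes FileMonitor as having one public serialization entry point, with derived types overriding the private virtual `serialize_type` when they carry extra state. A minimal sketch of that shape follows. It is an illustration only: `Builder` here is a hypothetical plain struct standing in for the generated `pcp::FileMonitor::Builder`, the public method name `serialize` is assumed, and `type()` returns a string rather than the real `Type` enum.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the capnproto-generated builder.
struct Builder {
  std::string type;
  std::map<std::string, uint64_t> fields;
};

class FileMonitor {
public:
  virtual ~FileMonitor() = default;
  virtual std::string type() const { return "Base"; }

  // Public entry point (name assumed): writes the common part, then
  // lets the derived class add whatever extra state it owns.
  void serialize(Builder& builder) const {
    builder.type = type();
    serialize_type(builder);
  }

private:
  // Default hook: no type-specific state to write.
  virtual void serialize_type(Builder&) const {}
};

class BpfMapMonitor : public FileMonitor {
public:
  BpfMapMonitor(uint64_t key_size, uint64_t value_size)
      : key_size_(key_size), value_size_(value_size) {}

  std::string type() const override { return "BpfMap"; }

private:
  // Only the monitor-specific fields are written here.
  void serialize_type(Builder& builder) const override {
    builder.fields["keySize"] = key_size_;
    builder.fields["valueSize"] = value_size_;
  }

  uint64_t key_size_;
  uint64_t value_size_;
};
```

Keeping the hook private means callers can only go through the public entry point, so the common fields are always written before the type-specific ones.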
