Refactor Monitor API to public Alive API#229

Open
NicolasFussberger wants to merge 19 commits into
eclipse-score:mainfrom
etas-contrib:feature/rename-monitor-to-alive

@NicolasFussberger NicolasFussberger commented Jun 5, 2026

Contributor

Refactors the Launch Manager "Monitor" interface to the Alive API.

Example usage:

#include "score/mw/lifecycle/alive.h"

score::mw::lifecycle::Alive alive("instance-name");

while (!shutdownRequested) {
  // Do stuff
  // ...

  alive.ReportAlive();
}

let alive = alive_rs::Alive::new("my instance").expect("Failed to create alive instance");
alive.report_alive();

Public bazel targets:
//score/launch_manager:alive_cc
//score/launch_manager:alive_rust

Changes:

  • Rename Monitor class to Alive
  • Remove the possibility to report different checkpoint ids from the public API
  • Since this is a public launch manager API, it is moved to the new top-level score/launch_manager/alive folder

Open Questions:

  • Feature Architecture defines method "report_health_status" which seems odd if we are only reporting an alive notification. What was the intention of this interface definition?

Closes: #149

github-actions Bot commented Jun 5, 2026

License Check Results

🚀 The license check job ran with the Bazel command:

bazel run --lockfile_mode=error //:license-check

Status: ⚠️ Needs Review

[License Check Output]
Extracting Bazel installation...
Starting local Bazel server (8.4.2) and connecting to it...
INFO: Invocation ID: 08354aaf-5637-4ef8-9892-17eca1e21929
INFO: Analyzed target //:license-check (163 packages loaded, 10268 targets configured).
[10 / 16] [Prepa] Expanding template external/score_tooling+/dash/tool/formatters/dash_format_converter [for tool]
[14 / 16] JavaToolchainCompileBootClasspath external/rules_java+/toolchains/platformclasspath.jar; 0s disk-cache, processwrapper-sandbox
[15 / 16] Building license.check.license_check.jar (); 0s disk-cache, multiplex-worker
INFO: Found 1 target...
Target //:license.check.license_check up-to-date:
  bazel-bin/license.check.license_check
  bazel-bin/license.check.license_check.jar
INFO: Elapsed time: 19.870s, Critical Path: 2.44s
INFO: 16 processes: 12 internal, 3 processwrapper-sandbox, 1 worker.
INFO: Build completed successfully, 16 total actions
INFO: Running command line: bazel-bin/license.check.license_check ./formatted.txt <args omitted>
usage: org.eclipse.dash.licenses.cli.Main [-batch <int>] [-cd <url>]
       [-confidence <int>] [-ef <url>] [-excludeSources <sources>] [-help] [-lic
       <url>] [-project <shortname>] [-repo <url>] [-review] [-summary <file>]
       [-timeout <seconds>] [-token <token>]

github-actions Bot commented Jun 5, 2026

The created documentation from the pull request is available at: docu-html

Comment thread examples/rust_supervised_app/Cargo.toml
Comment thread score/launch_manager/alive/BUILD Outdated
Comment thread score/launch_manager/alive/src/details/BUILD Outdated
score_testing_macros.workspace = true
containers.workspace = true
monitor_rs = { workspace = true, optional = true }
alive_rs = { workspace = true, optional = true }

Contributor

is optional correct here?

Contributor Author

I suppose it is optional because health_monitor also contains a stub implementation of the interface.
However, the cargo build does not work for me locally, neither before nor with these changes.

Is the cargo build still in use, @pawelrutkaq?


private:
/// @brief Unique pointer to implementation class of Alive
std::unique_ptr<AliveImpl> aliveImplPtr;

Contributor

This pimpl split (Alive / AliveImpl) does not seem to bring any real benefit here.

Contributor Author

The benefit of the pimpl is that we can keep the public API minimal.
If I removed the AliveImpl class, everything it includes would have to become part of the public API, and those classes are meant to stay internal so that we can freely change them later.

Contributor

Not really, in my opinion: what is declared public is the public API; the rest can be private/protected and you can still change it. The pimpl here is just code duplication.

Contributor Author

I think using the pimpl pattern has two big benefits here.

1. Keeping implementation details private

The AliveImpl header includes the following internal headers:
#include "score/mw/launch_manager/alive_monitor/details/ifappl/DataStructures.hpp"
#include "score/mw/launch_manager/alive_monitor/details/ipc/IpcClient.hpp"
#include "score/mw/launch_manager/alive_monitor/details/logging/PhmLogger.hpp"
which again include other internal headers.

If the AliveImpl class were removed, those includes would move into the Alive header and thus have to be part of the public API (plus all their transitive headers).
Those are all implementation details that should stay private; otherwise we risk breaking users whenever we change those classes. Furthermore, users may start relying on these implementation details (e.g. the specific shared memory layout defined in DataStructures.hpp).

2. ABI compatibility

The size and layout of the Alive class stay unchanged when the implementation changes. This can become important when the score stack is delivered as a set of precompiled binaries rather than as Bazel modules.

While both are clear benefits in my opinion, I would say the first one is the major one.
I don't see why we would expose all these implementation details as part of the API if we can avoid it.
The cost of the code duplication is rather small here compared to the benefit.
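
To illustrate the two points above, here is a minimal single-file sketch of the pimpl layout. It is not the code from this PR: the `sketch` namespace, the `ReportCount()` accessor, and the contents of `AliveImpl` are assumptions made up for the example; only the names `Alive`, `AliveImpl`, and `ReportAlive()` are taken from the discussion.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// ---- Would live in the public header ----
class AliveImpl;  // forward declaration: no internal headers are pulled in

class Alive {
  public:
    explicit Alive(std::string instance_name);
    ~Alive();  // defined below, where AliveImpl is a complete type

    void ReportAlive() const noexcept;
    unsigned ReportCount() const noexcept;  // hypothetical, for the demo only

  private:
    std::unique_ptr<AliveImpl> impl_;
};

// The public class holds only the pointer, so its size does not depend on
// whatever AliveImpl contains.
static_assert(sizeof(Alive) == sizeof(std::unique_ptr<AliveImpl>),
              "Alive holds nothing but the pimpl pointer");

// ---- Would live in the .cpp file ----
class AliveImpl {
  public:
    explicit AliveImpl(std::string name) : name_(std::move(name)) {}
    void Report() noexcept { ++count_; }
    unsigned Count() const noexcept { return count_; }
    const std::string& Name() const noexcept { return name_; }

  private:
    std::string name_;     // stand-in for IPC client, logger, shared-memory data
    unsigned count_ = 0U;  // number of alive reports so far
};

Alive::Alive(std::string instance_name)
    : impl_(std::make_unique<AliveImpl>(std::move(instance_name))) {}

Alive::~Alive() = default;

void Alive::ReportAlive() const noexcept {
    assert(impl_ != nullptr);
    impl_->Report();
}

unsigned Alive::ReportCount() const noexcept { return impl_->Count(); }

}  // namespace sketch
```

In this arrangement, adding or removing members of `AliveImpl` changes neither the includes nor the size of the public `Alive` class, which is the trade-off being weighed against the extra forwarding code.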

/// @brief Reports an occurrence of a Checkpoint
/// @param [in] f_checkpointId Checkpoint identifier.
void ReportCheckpoint(Checkpoint f_checkpointId) const noexcept(true);
void ReportCheckpoint(std::uint32_t f_checkpointId) const noexcept(true);

Contributor

Two things: First, I don't see any benefit in having both Alive and AliveImpl. Second, the Impl provides an extended interface (a checkpoint id instead of just an alive notification). Why?

@NicolasFussberger NicolasFussberger Jun 9, 2026

Contributor Author

Most of the implementation still deals with checkpoints.
As a first step, I removed them from the public API. The rest of the code is yet to be simplified; I did not want to propagate the change further here, as it would result in a massive amount of changes.
But of course you are right: a numbered checkpoint is obsolete when there is only an alive notification.
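
The intermediate state described here (a public API without checkpoint ids on top of internals that are still checkpoint based) could look roughly like the sketch below. `CheckpointReporter`, `kAliveCheckpointId`, and its value 0 are assumptions for illustration, not the names or values used in the repository.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Assumed: the single checkpoint id that stands for "alive".
constexpr std::uint32_t kAliveCheckpointId = 0U;

// Hypothetical stand-in for the internal, still checkpoint-based layer.
class CheckpointReporter {
  public:
    void ReportCheckpoint(std::uint32_t checkpoint_id) noexcept {
        last_id_ = checkpoint_id;
        ++count_;
    }
    std::uint32_t LastId() const noexcept { return last_id_; }
    unsigned Count() const noexcept { return count_; }

  private:
    std::uint32_t last_id_ = 0U;
    unsigned count_ = 0U;
};

// Public-facing class: callers can no longer choose a checkpoint id.
class Alive {
  public:
    void ReportAlive() noexcept { reporter_.ReportCheckpoint(kAliveCheckpointId); }
    const CheckpointReporter& Reporter() const noexcept { return reporter_; }  // demo only

  private:
    CheckpointReporter reporter_;
};

// Usage: every alive report reaches the internal layer with the fixed id.
inline void Demo() {
    Alive alive;
    alive.ReportAlive();
    assert(alive.Reporter().Count() == 1U);
    assert(alive.Reporter().LastId() == kAliveCheckpointId);
}

}  // namespace sketch
```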

Contributor

So shall we link a ticket for the cleanup?

Contributor Author

I created a ticket here: #251

ramceb commented Jun 10, 2026

Contributor

FT-Team meeting: 2026-06-10: The report_health_status api shall also provide the possibility to report "not ok" state to shorten FTTI times in some use-cases.

@NicolasFussberger

Contributor Author

FT-Team meeting: 2026-06-10: The report_health_status api shall also provide the possibility to report "not ok" state to shorten FTTI times in some use-cases.

Okay understood. Then I guess the C++ API would look like this:

#include <cstdint>

enum class HealthStatus : std::uint8_t {
  kOk,
  kNotOk
};

class Alive {
  public:
    void ReportHealthStatus(HealthStatus status);
};

Though I am also wondering if having a separate method for the failure reporting would be more explicit.
Something like:

class Alive {
  public:
    void ReportAlive();
    void ReportFailure();
};

What do you think @pawelrutkaq @ramceb ?
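
For comparison, the two shapes proposed above can be put side by side in a small sketch. The class names `AliveEnumApi` and `AliveMethodApi`, the `Last()` accessor, and the in-memory `last_` state are assumptions added only so the example is self-contained; a real implementation would notify the launch manager instead of storing a value.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

enum class HealthStatus : std::uint8_t { kOk, kNotOk };

// Shape A: one entry point, the status is passed as a value.
class AliveEnumApi {
  public:
    void ReportHealthStatus(HealthStatus status) noexcept { last_ = status; }
    HealthStatus Last() const noexcept { return last_; }  // demo only

  private:
    HealthStatus last_ = HealthStatus::kOk;
};

// Shape B: one method per event, no status argument.
class AliveMethodApi {
  public:
    void ReportAlive() noexcept { last_ = HealthStatus::kOk; }
    void ReportFailure() noexcept { last_ = HealthStatus::kNotOk; }
    HealthStatus Last() const noexcept { return last_; }  // demo only

  private:
    HealthStatus last_ = HealthStatus::kOk;
};

// Usage: both shapes express the same two events at the call site.
inline void Demo() {
    AliveEnumApi a;
    a.ReportHealthStatus(HealthStatus::kNotOk);

    AliveMethodApi b;
    b.ReportFailure();

    assert(a.Last() == b.Last());
}

}  // namespace sketch
```

Both carry the same information; the difference is mainly whether the status can be computed and passed around as data (shape A) or is fixed by the method name at the call site (shape B).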

pawelrutkaq commented Jun 15, 2026

Contributor

For me, it does not matter whether we use an enum or static names. Just let's not have two interfaces for the same thing ;)

@@ -0,0 +1,89 @@
/********************************************************************************

Contributor

This file name is lower case, while the others are upper case?

Contributor Author

The other public APIs in launch manager all use lower case,
so now it is at least consistent to the outside.
We'll have to adapt the private code later.

Comment thread score/launch_manager/alive/src/alive.h Outdated
void score_lcm_alive_report_failure(void* instance) noexcept;
#ifdef __cplusplus
}
#endif

Contributor

This does not need to be part of the header, does it? Can it be in the cpp file?

Contributor Author

You are right, I removed it from the header. It seems sufficient to have this in the cpp file.


void Alive::ReportFailure() const noexcept
{
// Not implemented

Contributor

assert

Contributor Author

I added an assertion that fails when this method is called.


void score_lcm_alive_report_failure(void* instance) noexcept {
static_cast<score::mw::lifecycle::Alive*>(instance)->ReportFailure();
}

Contributor

missing nullptr checks

Contributor Author

I am not sure what the behavior should be when a nullptr is passed; this would always be a programming error.
Silently ignoring the nullptr might hide such programming errors.
I have now introduced the baselibs assert macros as the nullptr check.
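
As a rough sketch of that approach, the C wrapper can assert on the pointer before forwarding the call. The baselibs assert macros are not shown in this thread, so the standard `assert` from `<cassert>` is used here as a stand-in; the function name `sketch_alive_report_alive` and the counting `Alive` class are likewise hypothetical.

```cpp
#include <cassert>

namespace sketch {

// Minimal stand-in for the C++ class behind the C interface.
class Alive {
  public:
    void ReportAlive() noexcept { ++count_; }
    unsigned Count() const noexcept { return count_; }  // demo only

  private:
    unsigned count_ = 0U;
};

}  // namespace sketch

// C-callable wrapper: a null instance is treated as a programming error and
// trips the assertion instead of being silently ignored.
extern "C" void sketch_alive_report_alive(void* instance) noexcept {
    assert(instance != nullptr && "instance must not be null");
    static_cast<sketch::Alive*>(instance)->ReportAlive();
}
```

Note that the standard `assert` is compiled out when `NDEBUG` is defined, so a project-specific macro may behave differently in release builds.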

* Introduce nullptr asserts
* For the not-yet-implemented method ReportFailure(), use an assert rather than a
deprecation warning.
* Remove C functions from header file. Seems enough to have them in the
  cpp file
{
}

Alive::Alive(Alive&& se) noexcept :

Contributor

I think this move constructor and move assignment can just be = default

Contributor Author

done in c7019a4
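
For readers following along, this is one common way to default the move operations of a class that owns its implementation through a `std::unique_ptr`. It is a sketch under assumptions, not the content of the referenced commit: the `AliveImpl` contents and the `HasImpl()` / `Value()` helpers are invented for the example. The operations are declared in the class and defaulted where `AliveImpl` is complete, which is a conservative placement for a pimpl type.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

namespace sketch {

struct AliveImpl;  // incomplete here, as it would be in a public header

class Alive {
  public:
    Alive();
    ~Alive();
    Alive(Alive&&) noexcept;             // declared here ...
    Alive& operator=(Alive&&) noexcept;

    bool HasImpl() const noexcept { return impl_ != nullptr; }  // demo only
    int Value() const noexcept;                                 // demo only

  private:
    std::unique_ptr<AliveImpl> impl_;
};

// ---- Would live in the .cpp file ----
struct AliveImpl {
    int value = 42;  // placeholder state
};

Alive::Alive() : impl_(std::make_unique<AliveImpl>()) {}
Alive::~Alive() = default;
Alive::Alive(Alive&&) noexcept = default;             // ... defaulted here
Alive& Alive::operator=(Alive&&) noexcept = default;

int Alive::Value() const noexcept {
    assert(impl_ != nullptr && "Value() called on a moved-from Alive");
    return impl_->value;
}

// The defaulted operations simply move the unique_ptr member.
static_assert(std::is_nothrow_move_constructible<Alive>::value, "");
static_assert(std::is_nothrow_move_assignable<Alive>::value, "");
static_assert(!std::is_copy_constructible<Alive>::value, "");

}  // namespace sketch
```

After a move, the source object's `unique_ptr` is guaranteed to be null, so a moved-from `Alive` in this sketch reports `HasImpl() == false`.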

Contributor

Did a grep for Monitor and found the following that might need to be updated:

  • ./score/launch_manager/daemon/src/alive_monitor/details/factory/FlatCfgFactory.cpp:93 - log mentions health monitor
  • ./score/launch_manager/alive/src/details/AliveImpl.h:28 - the brief should be updated

Contributor Author

done in 0fe8760

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

Rename current "HealthMonitor" API in LaunchManager to "Alive"

5 participants