Skip to content

Add internal profiling to debug tool #32423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

SangJunBak
Copy link
Contributor

@SangJunBak SangJunBak commented May 6, 2025

See commit messages for details

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@SangJunBak SangJunBak force-pushed the jun/debug-tool/add-internal-profiling branch 2 times, most recently from 7a2d70c to 754036d Compare May 6, 2025 17:56
SangJunBak added 3 commits May 6, 2025 15:34
- Instead of waiting until port forwarding, we read from stderr until we know for sure port forwarding has started
- Renamed `create_kubectl_port_forwarder` to `create_pg_wire_port_forwarder` since we'll be port forwarding more than once
- Removed the retry mechanism given we don't see the port forwarding disconnecting too often and we can't do it AND initial acknowledgement together easily
- Tie child process to a struct instead of a tokio spawn.
We create an initialization function that preprocesses as much as possible and maps args to context values.
I kept the trait but realized the enum is a bit redundant, especially since we have the Context enum
@SangJunBak SangJunBak force-pushed the jun/debug-tool/add-internal-profiling branch from 754036d to a2867a8 Compare May 7, 2025 20:20
SangJunBak added 5 commits May 8, 2025 00:26
Given we're always creating a PathBuf from start time, it makes sense to just pass the PathBuf itself
- Use stdout instead of stderr
- Remove sharing of context
- Fix bug where process was being killed on connecting to port forward
- Introduced InternalHttpDumpClient for downloading and saving heap profile data and Prometheus metrics from internal HTTP endpoints.
- Changes service for port forwarding from balancerd to environmentd for better consistency with metrics
- Refactor "find" methods for port forwarding to be more modular. Separates out port finding / port forwarder creation from service finding
- Introduced a new asynchronous function `get_container_ip` to fetch the IP address of a Docker container using its ID. Doing this means we don't need the user to map ports when doing `DOCKER RUN`
- Changes mz_connection_url for the emulator to an optional argument and global now that we have the container's IP
- Update CLI flags in docs and clippy/lint errors
@SangJunBak SangJunBak requested a review from ParkMyCar May 13, 2025 07:09
@SangJunBak SangJunBak changed the title WIP Add internal profiling to debug tool May 13, 2025
@SangJunBak SangJunBak marked this pull request as ready for review May 13, 2025 14:34
@SangJunBak SangJunBak requested a review from a team as a code owner May 13, 2025 14:34
Copy link
Contributor

@kay-kim kay-kim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs part lgtm

Copy link
Member

@ParkMyCar ParkMyCar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!

@@ -1,7 +1,7 @@
[package]
name = "mz-debug"
description = "Debug tool for self-managed Materialize."
version = "0.1.0"
version = "0.2.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bumping the version like this is a good habit, but out of curiosity is it necessary? do we publish it anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, just something manual I do whenever I decide to publish haha. We do publish it as a curlable binary that we expose via the docs!

static INTERNAL_HOST_ADDRESS: &str = "127.0.0.1";
static INTERNAL_HTTP_PORT: i32 = 6878;

/// A struct that handles downloading and saving profile data from HTTP endpoints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we always try to end comments with periods, in other words, make them complete sentences.

output_path: &Path,
) -> Result<(), anyhow::Error> {
// Try HTTPS first, then fall back to HTTP if that fails
let mut url = format!("https://{}", relative_url);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of manually formatting the URL like this I would opt to use the url crate and then the set_scheme(...) method on the Url type


/// Downloads and saves heap profile data
pub async fn dump_heap_profile(&self, relative_url: &str, service_name: &str) -> Result<()> {
let output_dir = self.context.base_path.join("profiles");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe these directory names should be constants that are grouped at the top of the file? That way it's easier to determine what folders the debug tool might create? Not blocking but just a thought

ContainerServiceDumper::Docker(dumper) => dumper.dump_container_resources().await,
}
}
enum DebugModeContext {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like how this turned out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants