Merged
82 changes: 0 additions & 82 deletions .github/workflows/deploy-profiling.yaml

This file was deleted.

7 changes: 7 additions & 0 deletions .github/workflows/deploy.yaml
@@ -4,6 +4,11 @@ on:
     branches: [main]
     tags: [v*]
   workflow_dispatch:
+    inputs:
+      features:
+        description: 'Optional cargo features (currently supported: "mimalloc-allocator")'
+        required: false
+        default: ''
Comment on lines +7 to +11
Contributor

Very nice! This can be made even more user-friendly by using `type: choice` and providing a list of options (see here). That way you don't even have to know the exact feature names. (Obviously, this runs the risk of the action getting out of sync with the features that are actually supported.)

Contributor Author

Yeah, I was thinking about a list, but that would make it harder to test new features since, in that case, you would need to provide custom inputs. AFAIK, GH doesn’t support a hybrid "preset options + custom text" input. A `choice` input only accepts values listed in `options`.


 jobs:
   deploy:
@@ -40,6 +45,8 @@ jobs:
           push: true
           tags: ${{ steps.meta_services.outputs.tags }}
           labels: ${{ steps.meta_services.outputs.labels }}
+          build-args: |
+            CARGO_BUILD_FEATURES=${{ github.event.inputs.features && format('--features {0}', github.event.inputs.features) || '' }}

       - name: Migration image metadata
         id: meta_migration
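
The `build-args` expression in this hunk uses the `a && b || c` idiom as a ternary: an empty `features` input is falsy, so the build arg stays empty, and any other value is prefixed with `--features`. A minimal Rust sketch of the same mapping follows; `cargo_build_features` is a hypothetical helper name used only for illustration and is not part of this PR.

```rust
/// Mirrors the workflow expression
/// `inputs.features && format('--features {0}', inputs.features) || ''`.
/// An empty input yields an empty build arg; anything else is prefixed
/// with `--features`. (Hypothetical helper, not code from the PR.)
fn cargo_build_features(features_input: &str) -> String {
    if features_input.is_empty() {
        String::new()
    } else {
        format!("--features {}", features_input)
    }
}
```

Under this reading, dispatching the workflow with `mimalloc-allocator` passes `CARGO_BUILD_FEATURES=--features mimalloc-allocator` to the Docker build, while a push to `main` (no input) leaves the arg empty.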
6 changes: 0 additions & 6 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 1 addition & 4 deletions Dockerfile
@@ -10,10 +10,7 @@ ARG CARGO_BUILD_FEATURES=""

 # Install dependencies
 RUN --mount=type=cache,target=/var/cache/apt,sharing=locked apt-get update && \
-    apt-get install -y git libssl-dev pkg-config && \
-    if echo "${CARGO_BUILD_FEATURES}" | grep -q "jemalloc-profiling"; then \
-        apt-get install -y build-essential; \
-    fi
+    apt-get install -y git libssl-dev pkg-config build-essential
 # Install Rust toolchain
 RUN rustup install stable && rustup default stable

22 changes: 8 additions & 14 deletions README.md
@@ -113,28 +113,22 @@ tokio-console

 ## Heap Profiling

-All binaries support opt-in heap profiling using jemalloc's profiling capabilities. This allows you to analyze memory usage in production environments without restarting services.
+All binaries use jemalloc as the default memory allocator with built-in heap profiling support. Profiling is enabled at runtime via the `MALLOC_CONF` environment variable, allowing you to analyze memory usage in production environments without recompiling or restarting services.

-### Building with Heap Profiling
+**Note:** You can optionally use mimalloc instead of jemalloc by building with `--features mimalloc-allocator`, but this disables heap profiling capability.

-Build with the `jemalloc-profiling` feature:
-```bash
-cargo build --release --features jemalloc-profiling
-```
+### Enabling Heap Profiling

-Or with Docker:
+To enable heap profiling, run services with the `MALLOC_CONF` environment variable set:
 ```bash
-docker build --build-arg CARGO_BUILD_FEATURES="--features jemalloc-profiling" .
+MALLOC_CONF="prof:true,prof_active:true,lg_prof_sample:22"
 ```

-### Generating Heap Dumps
+When profiling is enabled, each binary opens a UNIX socket at `/tmp/heap_dump_<binary_name>.sock`.

-When running with the profiling feature enabled, each binary opens a UNIX socket at `/tmp/heap_dump_<binary_name>.sock`. To generate a heap dump, connect to the socket and send the "dump" command:
+### Generating Heap Dumps

-**Note:** Services must be run with the `MALLOC_CONF` environment variable set:
-```bash
-MALLOC_CONF="prof:true,prof_active:true,lg_prof_sample:19"
-```
+Connect to the socket and send the "dump" command:

 ```bash
 # From Kubernetes
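
The README hunk changes `lg_prof_sample` from 19 to 22. jemalloc documents this option as the base-2 logarithm of the average number of allocated bytes between samples, so the new value samples roughly every 4 MiB instead of every 512 KiB. A small sketch of that arithmetic, assuming the `key:value,key:value` layout shown in the README; `sample_interval_bytes` is a hypothetical helper and not something jemalloc or this PR provides.

```rust
/// Extracts `lg_prof_sample` from a MALLOC_CONF-style string
/// (comma-separated `key:value` pairs) and returns the average number
/// of bytes between samples, i.e. 2^lg_prof_sample.
/// Returns `None` if the key is missing or its value is not a valid shift.
fn sample_interval_bytes(malloc_conf: &str) -> Option<u64> {
    malloc_conf
        .split(',')
        .filter_map(|pair| pair.split_once(':'))
        .find(|(key, _)| key.trim() == "lg_prof_sample")
        .and_then(|(_, value)| value.trim().parse::<u32>().ok())
        .and_then(|lg| 1u64.checked_shl(lg))
}
```

For the new README value this yields 4,194,304 bytes, and for the old one 524,288 bytes, which is why the higher setting produces coarser profiles with less sampling overhead.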
7 changes: 3 additions & 4 deletions crates/alerter/Cargo.toml
@@ -11,9 +11,8 @@ anyhow = { workspace = true }
 clap = { workspace = true }
 humantime = { workspace = true }
 observe = { workspace = true }
-mimalloc = { workspace = true }
-tikv-jemallocator = { workspace = true, optional = true }
-jemalloc_pprof = { workspace = true, optional = true }
+mimalloc = { workspace = true, optional = true }
+tikv-jemallocator = { workspace = true }
 model = { workspace = true }
 number = { workspace = true }
 prometheus = { workspace = true }
@@ -30,4 +29,4 @@ warp = { workspace = true }
 workspace = true

 [features]
-jemalloc-profiling = ["dep:tikv-jemallocator", "dep:jemalloc_pprof", "observe/jemalloc-profiling"]
+mimalloc-allocator = ["dep:mimalloc"]
2 changes: 1 addition & 1 deletion crates/alerter/src/lib.rs
@@ -392,7 +392,7 @@ pub async fn start(args: impl Iterator<Item = String>) {
     );
     observe::tracing::initialize(&obs_config);
     observe::panic_hook::install();
-    #[cfg(all(unix, feature = "jemalloc-profiling"))]
+    #[cfg(unix)]
     observe::heap_dump_handler::spawn_heap_dump_handler();
     observe::metrics::setup_registry(Some("gp_v2_alerter".to_string()), None);
     tracing::info!("running alerter with {:#?}", args);
8 changes: 4 additions & 4 deletions crates/alerter/src/main.rs
@@ -1,10 +1,10 @@
-#[cfg(feature = "jemalloc-profiling")]
+#[cfg(feature = "mimalloc-allocator")]
 #[global_allocator]
-static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
+static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

-#[cfg(not(feature = "jemalloc-profiling"))]
+#[cfg(not(feature = "mimalloc-allocator"))]
 #[global_allocator]
-static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
+static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

 #[tokio::main]
 async fn main() {
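
The `main.rs` change inverts the default: jemalloc is now selected unless the `mimalloc-allocator` feature is enabled. The sketch below illustrates the same compile-time selection pattern using only the standard library. `DefaultAllocator`, `OptInAllocator`, and `active_allocator` are hypothetical stand-ins that forward to the system allocator; they are assumptions for illustration, not the real `tikv_jemallocator` or `mimalloc` types.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Stand-in for `tikv_jemallocator::Jemalloc` (the new default).
#[allow(dead_code)]
struct DefaultAllocator;

/// Stand-in for `mimalloc::MiMalloc` (opt-in via the feature flag).
#[allow(dead_code)]
struct OptInAllocator;

// Both stand-ins simply forward to the system allocator.
unsafe impl GlobalAlloc for DefaultAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

unsafe impl GlobalAlloc for OptInAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Exactly one of these statics exists in any given build.
#[cfg(feature = "mimalloc-allocator")]
#[global_allocator]
static GLOBAL: OptInAllocator = OptInAllocator;

#[cfg(not(feature = "mimalloc-allocator"))]
#[global_allocator]
static GLOBAL: DefaultAllocator = DefaultAllocator;

/// Reports which branch was compiled in.
fn active_allocator() -> &'static str {
    if cfg!(feature = "mimalloc-allocator") {
        "mimalloc (stand-in)"
    } else {
        "jemalloc (stand-in)"
    }
}
```

Because the two `cfg` conditions are exact negations of each other, a build can never end up with zero or two global allocators, which is the property the PR relies on in each binary.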
7 changes: 3 additions & 4 deletions crates/autopilot/Cargo.toml
@@ -39,9 +39,8 @@ humantime = { workspace = true }
 indexmap = { workspace = true }
 itertools = { workspace = true }
 maplit = { workspace = true }
-mimalloc = { workspace = true }
-tikv-jemallocator = { workspace = true, optional = true }
-jemalloc_pprof = { workspace = true, optional = true }
+mimalloc = { workspace = true, optional = true }
+tikv-jemallocator = { workspace = true }
 model = { workspace = true }
 num = { workspace = true }
 number = { workspace = true }
@@ -78,4 +77,4 @@ vergen = { workspace = true, features = ["git", "gitcl"] }
 workspace = true

 [features]
-jemalloc-profiling = ["dep:tikv-jemallocator", "dep:jemalloc_pprof", "observe/jemalloc-profiling"]
+mimalloc-allocator = ["dep:mimalloc"]
8 changes: 4 additions & 4 deletions crates/autopilot/src/main.rs
@@ -1,10 +1,10 @@
-#[cfg(feature = "jemalloc-profiling")]
+#[cfg(feature = "mimalloc-allocator")]
 #[global_allocator]
-static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
+static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

-#[cfg(not(feature = "jemalloc-profiling"))]
+#[cfg(not(feature = "mimalloc-allocator"))]
 #[global_allocator]
-static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
+static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

 #[tokio::main]
 async fn main() {
2 changes: 1 addition & 1 deletion crates/autopilot/src/run.rs
@@ -146,7 +146,7 @@ pub async fn start(args: impl Iterator<Item = String>) {
     );
     observe::tracing::initialize(&obs_config);
     observe::panic_hook::install();
-    #[cfg(all(unix, feature = "jemalloc-profiling"))]
+    #[cfg(unix)]
     observe::heap_dump_handler::spawn_heap_dump_handler();

     let commit_hash = option_env!("VERGEN_GIT_SHA").unwrap_or("COMMIT_INFO_NOT_FOUND");
7 changes: 3 additions & 4 deletions crates/driver/Cargo.toml
@@ -35,9 +35,8 @@ humantime = { workspace = true }
 humantime-serde = { workspace = true }
 hyper = { workspace = true }
 itertools = { workspace = true }
-mimalloc = { workspace = true }
-tikv-jemallocator = { workspace = true, optional = true }
-jemalloc_pprof = { workspace = true, optional = true }
+mimalloc = { workspace = true, optional = true }
+tikv-jemallocator = { workspace = true }
 moka = { workspace = true, features = ["future"] }
 num = { workspace = true }
 number = { workspace = true }
@@ -93,4 +92,4 @@ vergen = { workspace = true, features = ["git", "gitcl"] }
 workspace = true

 [features]
-jemalloc-profiling = ["dep:tikv-jemallocator", "dep:jemalloc_pprof", "observe/jemalloc-profiling"]
+mimalloc-allocator = ["dep:mimalloc"]
2 changes: 1 addition & 1 deletion crates/driver/src/infra/observe/mod.rs
@@ -38,7 +38,7 @@ pub mod metrics;
 pub fn init(obs_config: observe::Config) {
     observe::tracing::initialize_reentrant(&obs_config);
     metrics::init();
-    #[cfg(all(unix, feature = "jemalloc-profiling"))]
+    #[cfg(unix)]
     observe::heap_dump_handler::spawn_heap_dump_handler();
 }

8 changes: 4 additions & 4 deletions crates/driver/src/main.rs
@@ -1,10 +1,10 @@
-#[cfg(feature = "jemalloc-profiling")]
+#[cfg(feature = "mimalloc-allocator")]
 #[global_allocator]
-static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
+static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

-#[cfg(not(feature = "jemalloc-profiling"))]
+#[cfg(not(feature = "mimalloc-allocator"))]
 #[global_allocator]
-static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
+static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

 #[tokio::main]
 async fn main() {
3 changes: 1 addition & 2 deletions crates/observe/Cargo.toml
@@ -28,12 +28,11 @@ tracing-opentelemetry = { workspace = true }
 tracing-subscriber = { workspace = true, features = ["env-filter", "fmt", "time"] }
 warp = { workspace = true }
 tracing-serde = { workspace = true }
-jemalloc_pprof = { workspace = true, optional = true }
+jemalloc_pprof = { workspace = true }

 [lints]
 workspace = true

 [features]
 default = []
 axum-tracing = ["axum"]
-jemalloc-profiling = ["dep:jemalloc_pprof"]
15 changes: 8 additions & 7 deletions crates/observe/src/heap_dump_handler.rs
@@ -11,6 +11,9 @@ use {
 /// When "dump" command is sent, it generates a heap profile using
 /// jemalloc_pprof and streams the binary protobuf data back through the socket.
 ///
+/// Profiling is enabled at runtime via the MALLOC_CONF environment variable.
+/// Set MALLOC_CONF=prof:true to enable heap profiling.
+///
 /// Usage:
 /// ```bash
 /// # From your local machine (one-liner):
@@ -19,21 +22,19 @@ use {
 /// # Analyze with pprof:
 /// go tool pprof -http=:8080 heap.pprof
 /// ```
-#[cfg(all(unix, feature = "jemalloc-profiling"))]
 pub fn spawn_heap_dump_handler() {
-    // Check if jemalloc profiling is available before spawning the handler
-    // This prevents panics that would crash the entire process
+    // Check if jemalloc profiling is available at runtime
+    // This depends on whether MALLOC_CONF=prof:true was set
     let profiling_available =
         std::panic::catch_unwind(|| jemalloc_pprof::PROF_CTL.as_ref().is_some()).unwrap_or(false);

     if !profiling_available {
-        tracing::warn!(
-            "jemalloc profiling not available - heap dump handler not started. Ensure service is \
-             built with jemalloc-profiling feature and MALLOC_CONF is set."
-        );
+        // Profiling is disabled - do nothing
Contributor

This is not merge-blocking, but long term it would be nice if we could always spawn the dump handler and also support enabling profiling while the process is already running. This is possible with jemalloc, but I'm not sure whether just updating the env variable after starting the process would be sufficient for that.
What I have in mind is measuring only the memory growth without having to manually diff two memory dumps. In other words:

  1. start pod
  2. wait 1h to let caches fill to a reasonable degree
  3. enable profiling
  4. wait 24h
  5. collect memory dump
  6. analyze only the memory allocated during the 24h
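
For reference, the manual alternative mentioned above (diffing two dumps) can be pictured as follows. This is only a sketch under stated assumptions: each dump is reduced to a map from an allocation-site label to live bytes, and `Snapshot` and `growth` are hypothetical names, not the pprof format or any API from this PR.

```rust
use std::collections::BTreeMap;

/// Live bytes per allocation site. The site label is a hypothetical
/// stand-in for a symbolized stack trace from a heap dump.
type Snapshot = BTreeMap<String, u64>;

/// Returns the sites whose live bytes grew between two snapshots,
/// largest growth first (ties broken by site name).
fn growth(before: &Snapshot, after: &Snapshot) -> Vec<(String, u64)> {
    let mut grown: Vec<(String, u64)> = after
        .iter()
        .filter_map(|(site, &bytes)| {
            let baseline = before.get(site).copied().unwrap_or(0);
            bytes
                .checked_sub(baseline) // None when the site shrank
                .filter(|&delta| delta > 0) // drop unchanged sites
                .map(|delta| (site.clone(), delta))
        })
        .collect();
    grown.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    grown
}
```

Taking the first snapshot after the warm-up hour and the second after 24 hours would approximate steps 2 to 6 without delayed profiling, at the cost of collecting and comparing two dumps.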

Contributor Author

Unfortunately, it doesn't work well with delayed profiling (tikv/jemallocator#140). Probably, if the profiler missed the root allocation, it is not possible to record detached children, but I don't know for sure.

Also, this overcomplicates the solution, since the unix socket connection would need to accept more commands, etc. Anyway, I can try to do that in the future. But since enabled profiling doesn't really affect performance, this is not so critical.

Contributor

> the unix socket connection would need to accept more commands

This is already implemented for the filter reloading logic, so it's not terrible. But obviously, if the feature doesn't work well in jemalloc itself, it doesn't make sense to add this complexity.
In any case, I'm fine to merge without this.

Contributor Author

> This is already implemented for the filter reloading logic so not terrible

This is how the memory profiler dumps currently work as well.

         return;
     }

+    tracing::info!("jemalloc heap profiling is active");
+
     tokio::spawn(async move {
         let name = binary_name().unwrap_or("unknown".to_string());
         let socket_path = format!("/tmp/heap_dump_{name}.sock");
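
The rest of the handler is collapsed in this view. As a rough illustration of the socket protocol described in the README (connect, send "dump", read the profile back), here is a blocking, std-only sketch. It assumes newline-terminated commands and uses a placeholder payload instead of real `jemalloc_pprof` output; `serve_one` and `request_dump` are hypothetical names, and the real handler runs on tokio rather than blocking I/O.

```rust
use std::io::{BufRead, BufReader, Read, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;

/// Accepts one connection, reads one line and, if it is the "dump"
/// command, writes `payload` back. `payload` is a placeholder for the
/// pprof-encoded profile the real handler would produce.
/// Returns whether a dump was sent.
fn serve_one(listener: &UnixListener, payload: &[u8]) -> std::io::Result<bool> {
    let (stream, _addr) = listener.accept()?;
    let mut reader = BufReader::new(stream);
    let mut command = String::new();
    reader.read_line(&mut command)?;
    if command.trim() != "dump" {
        return Ok(false); // unknown command: close without replying
    }
    reader.get_mut().write_all(payload)?;
    Ok(true) // dropping `reader` closes the connection
}

/// Client side: sends "dump" and reads the reply until the server
/// closes the connection.
fn request_dump(socket_path: &Path) -> std::io::Result<Vec<u8>> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(b"dump\n")?;
    let mut reply = Vec::new();
    stream.read_to_end(&mut reply)?;
    Ok(reply)
}
```

Supporting further commands, as discussed in the review thread, would amount to extending the single `command.trim()` comparison into a match over command names.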
2 changes: 1 addition & 1 deletion crates/observe/src/lib.rs
@@ -4,7 +4,7 @@
 pub mod config;
 pub mod distributed_tracing;
 pub mod future;
-#[cfg(all(unix, feature = "jemalloc-profiling"))]
+#[cfg(unix)]
 pub mod heap_dump_handler;
 pub mod metrics;
 pub mod panic_hook;