Skip to content

feat: setup otlp prometheus exporter with correct invocation in node#1087

Merged
0xmovses merged 24 commits into
mainfrom
Icarus131/prometheus-otlp
Apr 15, 2025
Merged

feat: setup otlp prometheus exporter with correct invocation in node#1087
0xmovses merged 24 commits into
mainfrom
Icarus131/prometheus-otlp

Conversation

@0xIcarus
Copy link
Copy Markdown
Contributor

@0xIcarus 0xIcarus commented Mar 7, 2025

Summary

Changelog

Modified existing tracing crate to support prometheus metrics thru OTLP. Created an exporter and an example invocation in the movement full node

Testing

Command used to run the prometheus exporter:

nix develop --command bash -c "just movement-full-node native build.setup.eth-local.celestia-local.telemetry --keep-project"

Before running, make sure prometheus is installed locally:

brew install prometheus

Outstanding issues

apenzk and others added 16 commits February 11, 2025 08:04
Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz>
Co-authored-by: Andy Golay <andygolay@gmail.com>
Co-authored-by: musitdev <philippe.delrieu@free.fr>
Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz>
Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com>
Co-authored-by: Icarus131 <anirudhprasad131@protonmail.com>
Co-authored-by: primata <primata@movementlabs.xyz>
Co-authored-by: Icarus131 <anirudhprasad131@gmail.com>
Co-authored-by: Richard Melkonian <r.v.melkonian@gmail.com>
Co-authored-by: Andy Golay <andygolay@gmail.com>
Co-authored-by: musitdev <philippe.delrieu@free.fr>
Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz>
Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com>
Co-authored-by: Icarus131 <anirudhprasad131@protonmail.com>
Co-authored-by: primata <primata@movementlabs.xyz>
Co-authored-by: Icarus131 <anirudhprasad131@gmail.com>
Co-authored-by: Richard Melkonian <r.v.melkonian@gmail.com>
Co-authored-by: Radu Popa <radupopa21be@gmail.com>
Co-authored-by: Andy Golay <andygolay@gmail.com>
Co-authored-by: musitdev <philippe.delrieu@free.fr>
Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz>
Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com>
Co-authored-by: Icarus131 <anirudhprasad131@protonmail.com>
Co-authored-by: primata <primata@movementlabs.xyz>
Co-authored-by: Icarus131 <anirudhprasad131@gmail.com>
Co-authored-by: Richard Melkonian <r.v.melkonian@gmail.com>
Co-authored-by: Radu Popa <radupopa21be@gmail.com>
Co-authored-by: Liam Monninger <l.mak.monninger@gmail.com>
…#1075)

Co-authored-by: Andy Golay <andygolay@gmail.com>
Co-authored-by: musitdev <philippe.delrieu@free.fr>
Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz>
Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com>
Co-authored-by: Icarus131 <anirudhprasad131@protonmail.com>
Co-authored-by: primata <primata@movementlabs.xyz>
Co-authored-by: Icarus131 <anirudhprasad131@gmail.com>
Co-authored-by: Richard Melkonian <r.v.melkonian@gmail.com>
Co-authored-by: Radu Popa <radupopa21be@gmail.com>
@0xIcarus 0xIcarus marked this pull request as ready for review April 1, 2025 16:51
@0xmovses
Copy link
Copy Markdown
Contributor

0xmovses commented Apr 1, 2025

Ok first, all the merge conflicts need to be resolved and merge main in to this branch, that will resolve the diff here. Thanks!

Comment thread .github/workflows/checks-all.yml Outdated
@0xIcarus 0xIcarus force-pushed the Icarus131/prometheus-otlp branch from 760da1f to bf2124e Compare April 3, 2025 15:19
Copy link
Copy Markdown
Contributor

@0xmovses 0xmovses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing a local step for dev?

when running the cmd, prometheus process yields
bash: line 4: prometheus: command not found

cargo check --all-target gives some errors also:

error: cannot find macro `debug` in this scope
  --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:81:3
   |
81 |         debug!("Block tx execution: {:?}", tx_execution_results);
   |         ^^^^^
   |
help: consider importing one of these items
   |
1  + use aptos_logger::debug;
   |
1  + use tracing::debug;
   |

error[E0425]: cannot find function `aptos_test_root_address` in this scope
   --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:339:4
    |
339 |             aptos_test_root_address(),
    |             ^^^^^^^^^^^^^^^^^^^^^^^ not found in this scope
    |
help: consider importing one of these items
    |
233 +     use aptos_sdk::types::account_config::aptos_test_root_address;
    |
233 +     use aptos_types::account_config::aptos_test_root_address;
    |

error[E0433]: failed to resolve: use of undeclared type `AccountKey`
   --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:340:4
    |
340 |             AccountKey::from_private_key(private_key),
    |             ^^^^^^^^^^ use of undeclared type `AccountKey`
    |
help: consider importing this struct
    |
233 +     use aptos_sdk::types::AccountKey;
    |

error[E0425]: cannot find function `aptos_test_root_address` in this scope
   --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:455:4
    |
455 |             aptos_test_root_address(),
    |             ^^^^^^^^^^^^^^^^^^^^^^^ not found in this scope
    |
help: consider importing one of these items
    |
233 +     use aptos_sdk::types::account_config::aptos_test_root_address;
    |
233 +     use aptos_types::account_config::aptos_test_root_address;
    |

error[E0433]: failed to resolve: use of undeclared type `AccountKey`
   --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:456:4
    |
456 |             AccountKey::from_private_key(private_key),
    |             ^^^^^^^^^^ use of undeclared type `AccountKey`
    |
help: consider importing this struct
    |
233 +     use aptos_sdk::types::AccountKey;
    |

warning: unused import: `aptos_crypto::ValidCryptoMaterialStringExt`
   --> protocol-units/execution/maptos/opt-executor/src/executor/execution.rs:238:6
    |
238 |     use aptos_crypto::ValidCryptoMaterialStringExt;
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: the lint level is defined here
   --> protocol-units/execution/maptos/opt-executor/src/lib.rs:4:8
    |
4   | #[warn(unused_imports)]
    |        ^^^^^^^^^^^^^^

error: could not compile `maptos-opt-executor` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
Some errors have detailed explanations: E0425, E0433.
For more information about an error, try `rustc --explain E0425`.
warning: `maptos-opt-executor` (lib test) generated 1 warning
error: could not compile `maptos-opt-executor` (lib test) due to 5 previous errors; 1 warning emitte

Comment thread .github/workflows/checks-all.yml Outdated
run: |
nix develop --command bash -c "just movement-full-node native build.setup.eth-local.celestia-local.test-ggp-gas-fee -t=false"

core-resource-signer:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this test has been added, it runs test-ggp-gas-fee, but we're already running that elsewhere in this file.

ports:
- "3000:3000"
volumes:
- ./grafana/provisioning:/etc/grafana/provisioning
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicolaslpf will these values be the same in prod? Or do you imagine we will need some telelmetry.local.yml and a telemetry.production.yml?

@0xIcarus 0xIcarus force-pushed the Icarus131/prometheus-otlp branch from bf2124e to cb806d9 Compare April 7, 2025 12:02
@ganymedio
Copy link
Copy Markdown
Contributor

Is this the expected output? (Failed to read meta.json):
image

Also I'm noticing when I run this it tends to cause my laptop to hang up, at least when I try click on tui or hit Ctrl+C to quit, it's not reponsive... not sure if that's just something I'm getting locally, or if others are experiencing it?

@0xIcarus
Copy link
Copy Markdown
Contributor Author

0xIcarus commented Apr 7, 2025

Is this the expected output? (Failed to read meta.json): image

Also I'm noticing when I run this it tends to cause my laptop to hang up, at least when I try click on tui or hit Ctrl+C to quit, it's not reponsive... not sure if that's just something I'm getting locally, or if others are experiencing it?

@andygolay ideally it should launch prometheus and should be accessible at localhost:9091. I have not encountered this error before. I will try reproducing it on my end. As for the tui hanging, its because of how process compose is currently handling prometheus, you can just kill the prometheus process to bring back the tui (can use htop). I am working on a fix for that currently.

@ganymedio
Copy link
Copy Markdown
Contributor

In Prometheus (at http://localhost:9091/targets) I'm getting Error scraping target: Get "http://127.0.0.1:9464/metrics": dial tcp 127.0.0.1:9464: connect: connection refused

image

@0xIcarus
Copy link
Copy Markdown
Contributor Author

0xIcarus commented Apr 7, 2025

In Prometheus (at http://localhost:9091/targets) I'm getting Error scraping target: Get "http://127.0.0.1:9464/metrics": dial tcp 127.0.0.1:9464: connect: connection refused

image

@andygolay, it should work now

@0xIcarus 0xIcarus requested review from 0xmovses and ganymedio April 7, 2025 16:13
Copy link
Copy Markdown
Contributor

@ganymedio ganymedio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to get some metrics
image
after about 10 - 15 minutes the process-compose stopped being responsive and then I started getting Error scraping target: Get "http://127.0.0.1:9464/metrics": dial tcp 127.0.0.1:9464: connect: connection refused again in the Prometheus targets page.

So it did work, but then crashed after a while. I'll defer to @0xmovses @musitdev @mzabaluev as to whether that's sufficient to merge this PR, or whether the process-compose reliability should be fixed more thoroughly first?

@0xIcarus
Copy link
Copy Markdown
Contributor Author

0xIcarus commented Apr 9, 2025

I have updated the command to be used to run this in the PR description. When run without the tests, it seems to work perfectly fine, without any of the crashing issues.

Comment thread .github/workflows/checks-all.yml Outdated
env:
CELESTIA_LOG_LEVEL: FATAL # adjust the log level while debugging
run: |
nix develop --command bash -c "just movement-full-node native build.setup.eth-local.celestia-local.test-ggp-gas-fee -t=false"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Icarus131 I'm unclear on why this duplicate ggp test is added. Was it perhaps meant to be a CI check for the telemetry?

I wasn't sure how you're testing the docker-compose.telemetry.yml ... how does that work? Just general container tests? Or should there be a test for telemetry in CI?

Copy link
Copy Markdown
Contributor

@ganymedio ganymedio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup looks good; I've run Prometheus with the updated command and tried generating a few histograms; the service didn't crash for over 20 minutes while I was testing it.

Nice job @Icarus131

@0xmovses
Copy link
Copy Markdown
Contributor

0xmovses commented Apr 9, 2025

It's running all ok for me. I did a brew install prometheus but how are you guys getting the UI up? I don't have Docker runnings, assuming that's a prerequisite?

@0xIcarus
Copy link
Copy Markdown
Contributor Author

0xIcarus commented Apr 9, 2025

It's running all ok for me. I did a brew install prometheus but how are you guys getting the UI up? I don't have Docker runnings, assuming that's a prerequisite?

@0xmovses a prometheus session is spawned at localhost:9091, should be accessible as soon as it starts running in process compose. Once full node also starts running, you will be able to see it active in the /targets route

@0xmovses
Copy link
Copy Markdown
Contributor

0xmovses commented Apr 9, 2025

Ok, trying agin. Thanks.

@0xmovses
Copy link
Copy Markdown
Contributor

0xmovses commented Apr 9, 2025

Forget my last message, after some time, it loaded at :9091! 👍

Copy link
Copy Markdown
Contributor

@0xmovses 0xmovses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We define ENV vars using this pattern here.

https://github.com/movementlabsxyz/movement/blob/main/protocol-units/execution/maptos/util/src/config/common.rs

Could you use this pattern? Otherwise the implementation looks good.

Comment thread networks/movement/movement-client/src/bin/e2e/followers_consistent.rs Outdated
Comment thread networks/movement/movement-full-node/src/main.rs Outdated
Comment thread process-compose/movement-full-node/process-compose.telemetry.yml
Comment thread protocol-units/da/movement/protocol/light-node/src/main.rs Outdated
Comment thread util/tracing/src/lib.rs Outdated
Comment thread protocol-units/bridge/contracts/minter/sources/minter.move Outdated
@0xIcarus 0xIcarus requested review from 0xmovses and mzabaluev April 10, 2025 20:05
Copy link
Copy Markdown
Contributor

@0xmovses 0xmovses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@0xmovses
Copy link
Copy Markdown
Contributor

Ah one conflict to resolve then this can merge in.

@0xmovses 0xmovses merged commit 236c2be into main Apr 15, 2025
8 checks passed
@@ -0,0 +1,26 @@
version: "3"

environment:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we had a test for this as well. Is that somewhere in here? Is it added to?

https://github.com/movementlabsxyz/movement/pull/697/files#diff-57bb71e304dbeaa770299637cc3af79f7dc71aac30d1232ff25e8194af69cccb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants