Bump pgmonitor to v5.2.1 #4186

Merged
merged 1 commit into CrunchyData:main from bump-pgmonitor-to-v5.2.1
May 23, 2025

Conversation

dsessler7
Contributor

Checklist:

  • Have you added an explanation of what your changes do and why you'd like them to be included?
  • Have you updated or added documentation for the change, as applicable?
  • Have you tested your changes on all related environments with successful results, as applicable?
    • Have you added automated tests?

Type of Changes:

  • New feature
  • Bug fix
  • Documentation
  • Testing enhancement
  • Other

What is the current behavior (link to any open issues here)?

Our OTel metrics currently use a mixture of metrics from pgMonitor v5.1.1 and earlier.

What is the new behavior (if this is a feature change)?

  • Breaking change (fix or feature that would cause existing functionality to change)

This PR brings most of the metrics up to their equivalent implementations in pgMonitor v5.2.1.

Other Information:

SELECT c.buffers_checkpoint AS buffers_written
FROM pg_catalog.pg_stat_bgwriter c;
metrics:
- metric_name: ccp_stat_bgwriter_buffers_checkpoint

Contributor

When did the ccp_stat_bgwriter_* metrics go away?

Contributor Author

Not all ccp_stat_bgwriter_* metrics went away. In pgMonitor v5.1.0 (I think), the checkpoint-related ones moved to ccp_stat_checkpointer. In pgmonitor-extension, the function checks the PG version: if it is 17 or later it uses pg_stat_checkpointer, and otherwise it uses pg_stat_bgwriter.
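
For illustration, the version check described here could look roughly like the sketch below. This is not the pgmonitor-extension function: the name `sketch_checkpoint_buffers_written` is hypothetical, and the sketch assumes that on PG 17 and later the equivalent of `buffers_checkpoint` is the `buffers_written` column of `pg_stat_checkpointer`.

```sql
-- Hypothetical helper for illustration only; not part of pgMonitor or this PR.
CREATE FUNCTION sketch_checkpoint_buffers_written() RETURNS bigint AS $$
DECLARE
    result bigint;
BEGIN
    -- server_version_num is an integer such as 170000 for PostgreSQL 17.0.
    IF current_setting('server_version_num')::int >= 170000 THEN
        -- Assumption: PG 17+ exposes checkpoint buffer counts here.
        EXECUTE 'SELECT buffers_written FROM pg_catalog.pg_stat_checkpointer'
            INTO result;
    ELSE
        -- Older versions keep the counter in pg_stat_bgwriter.
        EXECUTE 'SELECT buffers_checkpoint FROM pg_catalog.pg_stat_bgwriter'
            INTO result;
    END IF;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

Dynamic `EXECUTE` is used so that the branch for the other PostgreSQL version never has to resolve a view or column that does not exist on the running server.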

Comment on lines +74 to +77
+ DROP FUNCTION IF EXISTS get_replication_lag();
  --- get_replication_lag is used by the OTel collector.
  --- get_replication_lag is created as function, so that we can query without warning on a replica.
- CREATE OR REPLACE FUNCTION get_replication_lag() RETURNS TABLE(bytes NUMERIC) AS $$
+ CREATE FUNCTION get_replication_lag() RETURNS TABLE(replica text, bytes NUMERIC) AS $$
Contributor

@benjaminjb benjaminjb May 23, 2025

🤔 Is there a reason to prefer "drop/create" vs. "create or replace"?

Contributor Author

When I changed some things inside the function I started getting errors with CREATE OR REPLACE and it said I needed to DROP the function instead.

Contributor

Interesting, I wonder what I'm missing -- I wonder if "CREATE OR REPLACE" can't actually change some aspects....

Ah well, not a problem here, just something I was curious about.

Contributor Author

Yeah it definitely doesn't like to replace the function if the signature changes but I think it also complained when other non-signature related things changed...
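
For context, PostgreSQL does not let `CREATE OR REPLACE FUNCTION` change an existing function's return type, and the columns of `RETURNS TABLE(...)` are part of that return type. The sketch below reproduces the situation with a hypothetical function, `sketch_lag`, whose bodies are placeholders rather than the real replication-lag query.

```sql
-- Hypothetical function for illustration only; the bodies are placeholders.
CREATE FUNCTION sketch_lag() RETURNS TABLE(bytes NUMERIC) AS $$
    SELECT 0::NUMERIC;
$$ LANGUAGE sql;

-- Adding an output column changes the return type, so this statement is
-- rejected and PostgreSQL suggests dropping the function first.
CREATE OR REPLACE FUNCTION sketch_lag() RETURNS TABLE(replica text, bytes NUMERIC) AS $$
    SELECT 'replica-1'::text, 0::NUMERIC;
$$ LANGUAGE sql;

-- Dropping and recreating succeeds.
DROP FUNCTION IF EXISTS sketch_lag();
CREATE FUNCTION sketch_lag() RETURNS TABLE(replica text, bytes NUMERIC) AS $$
    SELECT 'replica-1'::text, 0::NUMERIC;
$$ LANGUAGE sql;
```

One trade-off of the drop-and-create pattern is that `DROP FUNCTION` fails if other objects depend on the function, and any privileges granted on it have to be granted again after it is recreated.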

Contributor

@benjaminjb benjaminjb left a comment

LGTM, AFAICT -- I used a compare between 5.1.1 and 5.2.1 in pgmonitor to check that we have all the changes and one thing I noticed is that we don't have a ccp_autovacuum_workers metric. (I only noticed that because they moved it btw 5.1.1 and 5.2.1.) Is this not a metric we use?

@dsessler7
Contributor Author

> LGTM, AFAICT -- I used a compare between 5.1.1 and 5.2.1 in pgmonitor to check that we have all the changes and one thing I noticed is that we don't have a ccp_autovacuum_workers metric. (I only noticed that because they moved it btw 5.1.1 and 5.2.1.) Is this not a metric we use?

I don't see any references to ccp_autovacuum_workers in our monitoring installer.

Combine ccp_archive_command_status queries into one query.
Add semicolons to the end of all queries.
Make ccp_replication_lag_size return the replica name for the Grafana dashboard legend.
DROP functions rather than CREATE OR REPLACE to avoid errors due to changes in functions.
@dsessler7 dsessler7 merged commit e274e01 into CrunchyData:main May 23, 2025
19 checks passed
@dsessler7 dsessler7 deleted the bump-pgmonitor-to-v5.2.1 branch May 23, 2025 18:46
@dsessler7 dsessler7 force-pushed the bump-pgmonitor-to-v5.2.1 branch 2 times, most recently from 77bdfa5 to da63454 Compare May 23, 2025 20:33