Skip to content

Comments

PMM-14267 Fix persistence of changed listen ports#5007

Open
ademidoff wants to merge 3 commits intov3from
PMM-14267-fix-persistence-of-changed-listen-ports
Open

PMM-14267 Fix persistence of changed listen ports#5007
ademidoff wants to merge 3 commits intov3from
PMM-14267-fix-persistence-of-changed-listen-ports

Conversation

@ademidoff
Copy link
Member

@ademidoff ademidoff commented Feb 6, 2026

PMM-14267

Link to the Feature Build: SUBMODULES-4217

This pull request introduces a mechanism for immediate VictoriaMetrics configuration updates when an agent's listen port changes, preventing stale port scraping. It adds a new method to the prometheusService interface, updates the handler logic to detect port changes, and provides comprehensive unit tests for this detection. Additionally, mock implementations and interface definitions across multiple packages are updated to support the new method.

Immediate VictoriaMetrics configuration update on port changes:

  • Added ForceConfigurationUpdate(ctx context.Context) error to the prometheusService interface in managed/services/agents/deps.go, managed/services/inventory/deps.go, managed/services/management/deps.go, and managed/services/server/deps.go, along with documentation explaining its purpose for critical updates like port changes. [1] [2] [3] [4]
  • Updated agent state change handling in managed/services/agents/handler.go to detect listen port changes using the new checkPortChanged function, and trigger ForceConfigurationUpdate for VictoriaMetrics when a port change is detected. [1] [2] [3] [4]
  • Replaced direct agent reloads with models.FindAgentByID for consistency and reliability in agent lookup. [1] [2]

Testing and mocks:

  • Added a comprehensive unit test for checkPortChanged in managed/services/agents/handler_test.go, covering edge cases such as agent not found, old port zero, port equality, port difference, and type conversion.
  • Updated mock implementations in managed/services/inventory/mock_prometheus_service_test.go, managed/services/management/mock_prometheus_service_test.go, and managed/services/server/mock_prometheus_service_test.go to support the new ForceConfigurationUpdate method. [1] [2] [3]

Dependency and configuration updates:

  • Upgraded github.com/DATA-DOG/go-sqlmock dependency from v1.5.0 to v1.5.2 in go.mod for improved testing support.
  • Enabled issue-845-fix in .mockery.yaml as per the warning message.
  • Removed the go-consistent tool for two major reasons:
    • part of its functionality is already covered by golangci-lint
    • it is inflexible and difficult to configure, for example it can't reliably skip tests, which is annoying

@codecov
Copy link

codecov bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 28.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.86%. Comparing base (ae3203a) to head (d8e960e).
⚠️ Report is 18 commits behind head on v3.

Files with missing lines Patch % Lines
managed/services/agents/handler.go 36.84% 12 Missing ⚠️
...anaged/services/victoriametrics/victoriametrics.go 0.00% 3 Missing ⚠️
managed/services/vmalert/vmalert.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##               v3    #5007      +/-   ##
==========================================
- Coverage   45.90%   45.86%   -0.05%     
==========================================
  Files         366      367       +1     
  Lines       38438    38552     +114     
==========================================
+ Hits        17646    17680      +34     
- Misses      19093    19172      +79     
- Partials     1699     1700       +1     
Flag Coverage Δ
admin 17.82% <ø> (ø)
agent 53.52% <ø> (+0.19%) ⬆️
managed 46.45% <28.00%> (-0.15%) ⬇️
vmproxy 72.09% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ademidoff ademidoff force-pushed the PMM-14267-fix-persistence-of-changed-listen-ports branch 2 times, most recently from 278d929 to ca76922 Compare February 8, 2026 14:08
@ademidoff ademidoff force-pushed the PMM-14267-fix-persistence-of-changed-listen-ports branch from ca76922 to e66c34f Compare February 8, 2026 15:03
golangci_lint_flags: "-c=.golangci.yml"
golangci_lint_version: v2.6.2 # Version should match specified in Makefile

- name: Run go-consistent
Copy link
Member Author

@ademidoff ademidoff Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm removing the go-consistent tool for two major reasons:

  • part of its functionality is already covered by golangci-lint
  • it is inflexible and difficult to configure, for example it can't reliably skip tests, which requires workarounds and is simply annoying


agent := &models.Agent{AgentID: agentID}
err := q.Reload(agent)
agent, err := models.FindAgentByID(q, agentID)
Copy link
Member Author

@ademidoff ademidoff Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored in favor of using the model.

disable-version-string: True
with-expecter: False
resolve-type-alias: False
issue-845-fix: True
Copy link
Member Author

@ademidoff ademidoff Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed a mockery warning.

@ademidoff ademidoff marked this pull request as ready for review February 11, 2026 22:23
@ademidoff ademidoff requested a review from a team as a code owner February 11, 2026 22:23
@ademidoff ademidoff requested review from JiriCtvrtka and maxkondr and removed request for a team February 11, 2026 22:23
RequestConfigurationUpdate()
// ForceConfigurationUpdate triggers immediate synchronous configuration update,
// bypassing the batch delay. Use this for critical updates like port changes.
ForceConfigurationUpdate(ctx context.Context) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a separate method is required? Can't port change be handled via RequestConfigurationUpdate?

Copy link
Member Author

@ademidoff ademidoff Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is synchronous, and RequestConfigurationUpdate is async. We previously used that one, and it worked well, except in cases where the port number is changed. In such case, we need to synchronously update VM's configuration, otherwise there will be bugs like the one in question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants