Update OpAMP client to have additional callback methods #4322

Open
kelseyma wants to merge 3 commits into open-telemetry:main from kelseyma:update_callbacks

Conversation

@kelseyma

Description

Update the OpAMP client to use a Callbacks class, bringing Python closer to the existing opamp-go and Java opamp-client implementations. This adds a minimal set of callbacks for now (on_connect, on_connect_failed, on_error, on_message), each with a no-op default, so new callbacks can be added in the future without breaking existing subclasses.
Next, I'm hoping to work on client-side sending of messages after state changes, which might add some more callbacks and also remove the need to pass OpAMPAgent into the callbacks.
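
For illustration only, here is a rough sketch of the shape described above, not the code in this PR: a base class whose callbacks all default to no-ops, plus a wrapper that logs a callback's exception instead of letting it propagate. Only the on_message argument order (agent, client, message) is visible in this PR's diff. The other signatures, the `safe_invoke` helper (loosely modeled on the `_safe_invoke` name in the logs below), and `RecordingCallbacks` are assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class Callbacks:
    """Base class with no-op defaults; subclasses override only what they need."""

    def on_connect(self, agent, client):  # assumed signature
        pass

    def on_connect_failed(self, agent, client, error):  # assumed signature
        pass

    def on_error(self, agent, client, error_response):  # assumed signature
        pass

    def on_message(self, agent, client, message):
        pass


def safe_invoke(function, *args):
    """Call a callback and log any exception rather than propagating it."""
    try:
        function(*args)
    except Exception:
        logger.exception("Error when invoking function '%s'", function.__name__)


class RecordingCallbacks(Callbacks):
    """Hypothetical subclass: overrides two callbacks, inherits the rest."""

    def __init__(self):
        self.messages = []

    def on_connect(self, agent, client):
        raise ValueError("Testing error handling")

    def on_message(self, agent, client, message):
        self.messages.append(message)


callbacks = RecordingCallbacks()
safe_invoke(callbacks.on_connect, None, None)  # exception is logged, not raised
safe_invoke(callbacks.on_message, None, None, "hello")
safe_invoke(callbacks.on_connect_failed, None, None, OSError("refused"))  # inherited no-op
```

Because every method has a default, adding a fifth callback to the base class later would not require changes to `RecordingCallbacks`.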

Also removed some logging, as I found it a little verbose while testing (e.g. a 100-line traceback logged when the client can't connect to the server) and the exception was logged in multiple places. Happy to add it back if preferred.

Note: this is a breaking change to the OpAMPAgent constructor (message_handler is replaced by Callbacks), but the package hasn't been released yet. Some prior discussion on this: #3635 (comment)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Tested with a local sample app (implemented callbacks subclass) and test server

// on_connect and on_message
2026-03-09 14:35:35,197 - client - INFO - Connected to OpAMP server
2026-03-09 14:35:35,197 - opentelemetry._opamp.agent - DEBUG - Job succeeded: b'\n\x10\x01\x9c\xd4\x84!\x84q\xe0\xb8&\xd6\xa6\t\xbb\xd4\xab\x10\x06 \x87`'
2026-03-09 14:35:35,197 - client - DEBUG - Received OpAMP message (flags=0)
2026-03-09 14:35:35,198 - client - INFO - Received remote config (hash: fb0142b148fbf4a1)
2026-03-09 14:35:35,198 - config - INFO - Log level changed from debug to INFO
2026-03-09 14:35:35,199 - client - INFO - Sent config APPLIED status with effective config
2026-03-09 14:35:35,207 - client - INFO - Connected to OpAMP server
2026-03-09 14:36:05,195 - client - INFO - Connected to OpAMP server
2026-03-09 14:36:35,194 - client - INFO - Connected to OpAMP server
2026-03-09 14:37:05,215 - client - INFO - Connected to OpAMP server

// on_connect (throwing an exception) and on_connect_failed 
2026-03-09 14:27:10,099 - urllib3.connectionpool - DEBUG - http://localhost:8080 "POST /v1/opamp HTTP/1.1" 200 18
2026-03-09 14:27:10,100 - opentelemetry._opamp.agent - ERROR - Error when invoking function 'on_connect'
Traceback (most recent call last):
  File "/.../opentelemetry-python-contrib/opamp/opentelemetry-opamp-client/src/opentelemetry/_opamp/agent.py", line 34, in _safe_invoke
    function(*args)
    ~~~~~~~~^^^^^^^
  File "/.../sample-apps/docker/python-app/client.py", line 69, in on_connect
    raise ValueError("Testing _safe_invoke error handling")
ValueError: Testing _safe_invoke error handling
2026-03-09 14:27:10,101 - opentelemetry._opamp.agent - DEBUG - Job succeeded: b'\n\x10\x01\x9c\xd4~\xb6\xd4r\x81\x8a\x98\x12u \xe05\x17\x10\x01 \x87`'
2026-03-09 14:27:10,101 - client - DEBUG - Received OpAMP message (flags=0)
2026-03-09 14:27:40,092 - opentelemetry._opamp.agent - DEBUG - Periodic job enqueued
2026-03-09 14:27:40,094 - urllib3.connectionpool - DEBUG - Resetting dropped connection: localhost
2026-03-09 14:27:40,096 - client - WARNING - Failed to connect to OpAMP server: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v1/opamp (Caused by NewConnectionError("HTTPConnection(host='localhost', port=8080): Failed to establish a new connection: [Errno 61] Connection refused"))
2026-03-09 14:27:40,096 - opentelemetry._opamp.agent - WARNING - Job b'\n\x10\x01\x9c\xd4~\xb6\xd4r\x81\x8a\x98\x12u \xe05\x17\x10\x02 \x87`' failed attempt 1/2: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v1/opamp (Caused by NewConnectionError("HTTPConnection(host='localhost', port=8080): Failed to establish a new connection: [Errno 61] Connection refused"))
2026-03-09 14:27:40,096 - opentelemetry._opamp.agent - DEBUG - Retrying in 0.9s
2026-03-09 14:27:40,991 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (2): localhost:8080
2026-03-09 14:27:40,995 - client - WARNING - Failed to connect to OpAMP server: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v1/opamp (Caused by NewConnectionError("HTTPConnection(host='localhost', port=8080): Failed to establish a new connection: [Errno 61] Connection refused"))
2026-03-09 14:27:40,996 - opentelemetry._opamp.agent - WARNING - Job b'\n\x10\x01\x9c\xd4~\xb6\xd4r\x81\x8a\x98\x12u \xe05\x17\x10\x02 \x87`' failed attempt 2/2: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v1/opamp (Caused by NewConnectionError("HTTPConnection(host='localhost', port=8080): Failed to establish a new connection: [Errno 61] Connection refused"))
2026-03-09 14:27:40,996 - opentelemetry._opamp.agent - ERROR - Job b'\n\x10\x01\x9c\xd4~\xb6\xd4r\x81\x8a\x98\x12u \xe05\x17\x10\x02 \x87`' dropped after max retries
  • tox

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@kelseyma kelseyma marked this pull request as ready for review March 10, 2026 22:41
@kelseyma kelseyma requested a review from a team as a code owner March 10, 2026 22:41
from opentelemetry._opamp.client import OpAMPClient


class Callbacks:
Contributor

What do we think about making this an ABC (mostly to prevent it from being accidentally instantiated)?

Contributor

We can do that. I see that the Go client makes them all optional.

Author

I looked into ABC, and if we keep all callbacks optional, it seems Callbacks() could still be instantiated. To fully prevent that, we'd need to mark at least one method (e.g. on_message) as abstract, which would make it mandatory for subclasses.
Do we have a preferred behavior for the class? Go makes all callbacks optional, while the Java client requires them. I think it could be reasonable to require on_message, as that provides most of the value of using the OpAMP client. Happy to align with whichever behavior we want to standardize on!
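
To make the trade-off concrete, here is a small standalone sketch. The class names are made up for illustration and are not from the PR. An `abc.ABC` subclass with no abstract methods can still be instantiated, while marking one method with `@abc.abstractmethod` blocks instantiation until a subclass overrides it.

```python
import abc


class AllOptional(abc.ABC):
    """ABC with no abstract methods: instantiation still succeeds."""

    def on_message(self, agent, client, message):
        pass


class MessageRequired(abc.ABC):
    """ABC with one abstract method: subclasses must implement on_message."""

    @abc.abstractmethod
    def on_message(self, agent, client, message):
        ...

    def on_connect(self, agent, client):
        pass


class Handler(MessageRequired):
    def on_message(self, agent, client, message):
        return message


instance = AllOptional()  # allowed: nothing is abstract

try:
    MessageRequired()
    blocked = False
except TypeError:
    # Python refuses to instantiate a class with unimplemented abstract methods.
    blocked = True

handler = Handler()  # allowed: on_message is implemented
```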

Contributor

@herin049, Mar 12, 2026

This might be slightly overkill, but we can always do the following:

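import abc

class Callbacks(abc.ABC):
    def __new__(cls, *args, **kwargs):
        # Block direct instantiation of the base class; subclasses are allowed.
        if cls is Callbacks:
            raise TypeError(f"Cannot instantiate abstract class '{cls.__name__}'")
        # object.__new__ rejects extra arguments when __new__ is overridden.
        return super().__new__(cls)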

Even without this check, by extending abc.ABC most linters/IDEs should give a warning when attempting to directly instantiate Callbacks

job.payload,
job.attempt,
job.max_retries,
job.max_retries + 1,
Contributor

why the plus 1?

Author

When testing, I noticed the logs said this:

WARNING - Job xyz failed attempt 1/1
WARNING - Job xyz failed attempt 2/1

max_retries is 1, meaning we are doing 2 attempts total (normal attempt + 1 retry). I added plus 1 so the logs now read:

WARNING - Job xyz failed attempt 1/2
WARNING - Job xyz failed attempt 2/2
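
The counting above can be sketched as a tiny retry loop. This is a simplified stand-in rather than the agent's real scheduler: `run_with_retries` and its return shape are hypothetical, and backoff delays are omitted.

```python
def run_with_retries(job, max_retries):
    """Run job once, then retry up to max_retries times, recording each failure."""
    total_attempts = max_retries + 1  # the first attempt plus the retries
    log = []
    for attempt in range(1, total_attempts + 1):
        try:
            return job(), log
        except Exception as exc:
            log.append(f"failed attempt {attempt}/{total_attempts}: {exc}")
    log.append("dropped after max retries")
    return None, log


def always_refused():
    raise ConnectionError("refused")


result, log = run_with_retries(always_refused, max_retries=1)
```

With max_retries=1 the denominator is 2, which matches the corrected log lines.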



def test_dispatch_order():
    """Verify the opamp-go dispatch order: on_connect -> on_message -> on_error."""
Contributor

opamp-go?

logger.error(
    "Job %r dropped after max retries", job.payload
)
logger.exception(exc)
Contributor

Why remove this? The on_connect_failed callback does not have the job and doesn't know whether we are giving up.

Author

The exception (minus the traceback) is logged earlier in the failed-attempt message, and we’re still logging when we’re giving up. While testing on_connect_failed, I was getting a 100+ line traceback in the logs on each connection attempt, which made it a bit noisy. Happy to add it back if we’d prefer to keep the traceback.

logger.warning(
"Job %r handler failed with: %s", job.payload, exc
_safe_invoke(
self._callbacks.on_message, self, self._client, message
Member

Do we want to pass the entire message here? It looks like the Go implementation passes a subset of the message fields to on_message and does not include error_response. That would also remove ambiguity about whether errors should be handled in on_message or on_error.
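
One possible shape for that split, as a sketch only: every type and field name below is a hypothetical stand-in, not the real protobuf message or the opamp-go API, and the agent/client arguments are dropped for brevity. The call order follows the on_message -> on_error order named in this PR's test docstring.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ServerMessage:
    """Hypothetical stand-in for the decoded server message."""

    remote_config: Optional[Any] = None
    error_response: Optional[Any] = None
    flags: int = 0


@dataclass
class MessageData:
    """Hypothetical subset passed to on_message; error_response is left out."""

    remote_config: Optional[Any] = None
    flags: int = 0


class RecordingCallbacks:
    """Records which callbacks fire, and with what argument."""

    def __init__(self):
        self.calls = []

    def on_message(self, data):
        self.calls.append(("on_message", data))

    def on_error(self, error_response):
        self.calls.append(("on_error", error_response))


def dispatch(callbacks, message):
    # on_message only ever sees the non-error fields.
    callbacks.on_message(
        MessageData(remote_config=message.remote_config, flags=message.flags)
    )
    # Errors are routed exclusively to on_error.
    if message.error_response is not None:
        callbacks.on_error(message.error_response)


callbacks = RecordingCallbacks()
dispatch(callbacks, ServerMessage(remote_config="cfg", error_response="bad request"))
```

With this split, an on_message implementation cannot observe error_response at all, so error handling has exactly one home.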


4 participants