Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ops[tracing] (with a first-party charm lib) #1612

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

dimaqq
Copy link
Contributor

@dimaqq dimaqq commented Mar 5, 2025

The tracing facility for the Ops framework.

Quick start.

Declare the charm tracing interface and optionally the TLS interface in your
charmcraft.yaml.

    requires:
        charm-tracing:
            interface: tracing
            limit: 1
            optional: true
        send-ca-cert:
            interface: certificate_transfer
            limit: 1
            optional: true

If you're migrating from the charm-tracing charm lib, you most likely already
have these relations, but do note their names.

Caveat: this library pulls in pydantic, and the Rust build packages must be
specified in your charmcraft.yaml.

    parts:
        charm:
            plugin: charm
            source: .
            build-packages:
                - cargo

If you're migrating from the `charm-tracing` charm lib, that should already be the case.

Then add the Tracing object in your charm::

```py
    import ops

    class Charm(ops.CharmBase):
        def __init__(self, framework: ops.Framework):
        super().__init__(framework)
        ...
        self.tracing = ops.tracing.Tracing(
            self,
            tracing_relation_name='charm-tracing',
            ca_relation_name='send-ca-cert',
        )

Note that you don't have to import ops.tracing or import ops_tracing.
When ops[tracing] has been added to your charm's dependencies, the Ops
library imports this library and re-exports it as ops.tracing.

@dimaqq dimaqq force-pushed the feat-otel-ops-tracing-lib branch 2 times, most recently from dd17314 to b283701 Compare March 6, 2025 15:23
@dimaqq dimaqq changed the title very dirty drafty draft: ops[tracing] with a charm lib ops[tracing] with a charm lib Mar 7, 2025
@dimaqq dimaqq changed the title ops[tracing] with a charm lib feat: ops[tracing] with a charm lib Mar 7, 2025
@dimaqq dimaqq force-pushed the feat-otel-ops-tracing-lib branch from ee2f38a to fc56e7d Compare March 7, 2025 03:35
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mostly looked at _tracing.api since a lot of the rest is the same as in the other PR (I think).

I definitely prefer this approach over the new lifecycle/ops event one. It feels much more natural to only adjust the settings when they change. I think it will be easier to deprecate this when Juju eventually provides most of what's need than in the other approach too. However, I didn't hate the new event approach, so if you feel strongly that it's better then please do make the case for it.

Initialise the tracing thing (FIXME) with the names of these relations.

```py
import ops.tracing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually work at the moment? I can't find any import in ops/__init__.py and there doesn't seem to be an ops/tracing.py that imports things from ops/_tracing.py. Is there some magic I'm missing?

If we're only exposing a Tracing and tracer name, then I think we would want to (conditionally) have those available in the ops namespace, so ops.tracer and ops.Tracing rather than ops.tracing.tracer and ops.tracing.Tracing. With Scenario, we already had ops.testing as a namespace, and there are a lot of names there, only needed for testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written that with the expectation of the monorepo-style refactor, where:

  • a separate package holds the implementation
  • ops.tracing re-exports the API

Copy link
Contributor Author

@dimaqq dimaqq Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. tracer is or should be private, it's the ops tracer, not the charm tracer.
(akin to logger).

A "tracer" is an object with attributes like name (ops) and version, thus typically each large library has its own. There are no fixed rules there; my 2c is for coarse tracers, e.g.:

  • ops
  • charm
  • [for example] traefik charm lib

@dimaqq dimaqq force-pushed the feat-otel-ops-tracing-lib branch from fc56e7d to ec17c85 Compare March 9, 2025 15:02
@dimaqq dimaqq mentioned this pull request Mar 10, 2025
53 tasks
@dimaqq dimaqq force-pushed the feat-otel-ops-tracing-lib branch from e1d71b1 to 4cc1209 Compare March 13, 2025 08:51
@dimaqq dimaqq mentioned this pull request Mar 14, 2025
3 tasks
@dimaqq dimaqq changed the title feat: ops[tracing] with a charm lib feat: ops[tracing] (with a first-party charm lib) Mar 14, 2025
@dimaqq dimaqq marked this pull request as ready for review March 14, 2025 08:46
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only an initial review, sorry. I have ignored the vendored files, and haven't had time to review the tests. I also haven't had time to try it out myself yet. I'll try to get to the tests and trying it out early next week.

A bunch of mostly small comments as a starter.

Really nice work! This is clearly very large (even ignoring whitespace changes) and a significant addition to ops and charming. I really like the design and how you've implemented it (other than a few small quibbles mostly around names and privacy).

@@ -0,0 +1,45 @@
# ops-tracing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we name this README.md for consistency?

# ops-tracing

**First-party OpenTelemetry integration for the [ops](https://pypi.org/project/ops/) framework.**
This package adds tracing capabilities to ops, enabling you to observe the performance of your applications.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I believe we're going with "Ops" as the name.

Suggested change
This package adds tracing capabilities to ops, enabling you to observe the performance of your applications.
This package adds tracing capabilities to Ops, enabling you to observe the performance of your applications.

(Elsewhere in the readme as well).

{name="The Charm Tech team at Canonical Ltd."},
]
classifiers = [
"Development Status :: 5 - Production/Stable",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I know people do it, but I personally don't think Production/Stable should be used with a <1.0.0 version number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll bump the version to match ops before release.

from dataclasses import dataclass

EXPORT_TIMEOUT: int | float = 1 # seconds
"""How much to give OTLP span exporter has to push traces to the backend."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""How much to give OTLP span exporter has to push traces to the backend."""
"""How much time to give OTLP span exporter to push traces to the backend."""


from dataclasses import dataclass

EXPORT_TIMEOUT: int | float = 1 # seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the all-caps, probably this should be Final[int | float] instead? Same for the ones below.

Also: do you want to have these as public? Is there a reason that the charmer would use them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A regular charmer probably wouldn't.
If someone has custom tracing setup, e.g. to some SaaS, then they may want longer timeouts and would be using .set_dest... manually instead of the charm lib.

import time
import urllib.error
import urllib.request
from pathlib import Path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import pathlib :)

Comment on lines +25 to +26
from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to have these names public. What about renaming this module to _export.py?

Comment on lines +25 to +26
from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to have these names public. What about renaming this module to _export.py?


return SpanExportResult.SUCCESS
except Exception:
logger.exception('Exporing tracing data')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.exception('Exporing tracing data')
logger.exception('Exporting tracing data')

@@ -0,0 +1,73 @@
# Copyright 2025 Canonical Ltd.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used src-layout for Scenario, as it seemed to be the general recommendation for new Python projects. Should we use that here too? ops isn't, but is the legacy project in the repo.

Or do you think src-layout was a mistake and maybe we should move Scenario to not use it instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about it...
I went for the concise option, and it just works?
uv init --package ... does generate an src layout, while plain uv init generates a layoutless app.

I'm not sure about consistency for consistency's sake.

@dimaqq
Copy link
Contributor Author

dimaqq commented Mar 14, 2025

Note: weirdly the built docs show the content from a few commits back. 🤷🏻

@tonyandrewmeyer
Copy link
Contributor

Note: weirdly the built docs show the content from a few commits back. 🤷🏻

Your last published version maybe? See my comment buried somewhere in there about the hack to tox.ini I had to use for Scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants