[DBM] Add container tags hash to queries (if enabled)#8061
## Summary of changes
Replaced the custom mutex guard with `std::lock_guard`, using
`std::recursive_mutex` instead of `CRITICAL_SECTION` on Windows and
`std::mutex` with railings on Linux
## Reason for change
Some locks have been spotted in smoke tests, which could be caused by the
lack of recursive locking in `std::mutex`
## Implementation details
## Test coverage
## Other details
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4fd01fab6f
else
{
    // PropagateDataViaComment (service) - this injects various trace information as a comment in the query
    if (tracer.Settings.InjectSqlBasehash && !string.IsNullOrEmpty(baseHash))
    {
        tags.BaseHash = baseHash;
Set BaseHash even when DBM comment already present
This new BaseHash tagging only happens in the else branch when the command text is not already DBM-injected. In the cached‑command scenario (or when users pre‑inject DBM comments), alreadyInjected is true, so _dd.propagated_hash is never set on subsequent spans even though the query still carries ddsh in the SQL comment. If DBM looks up container tags by scanning recent spans for that hash, later queries can’t be enriched once the first span ages out. Consider setting tags.BaseHash whenever the feature is enabled (and baseHash is non‑empty), regardless of the alreadyInjected branch.
hmm, yes, that's an interesting point, but I'm not sure we care, we only need one span with the hash to get the values, so we don't really need to tag all spans. I think in practice it works well like this.
Consider adding a comment explaining this.
Benchmarks
Benchmark execution time: 2026-03-24 11:01:35
Comparing candidate commit 900ac90 in PR branch
Found 9 performance improvements and 7 performance regressions! Performance is the same for 258 metrics, 14 unstable metrics.
I just realized I need to put process tags in there too
bouwkast left a comment
I think the main question that I have is that it appears this is correctly following the RFC in how we propagate the hash, but the merged Python implementation recomputes the hash.
But the RFC isn't precise enough in describing the hash and expected behavior / requirements for me to know which is correct really
tracer/src/Datadog.Trace/DatabaseMonitoring/DatabaseMonitoringPropagator.cs
eb11b9e to d81eb71
getDiscoveryServiceFunc: static s => DiscoveryService.CreateUnmanaged(
    s.TracerSettings.Manager.InitialExporterSettings,
    ContainerMetadata.Instance,
    new ServiceRemappingHash(null),
this one I'm not 100% sure, but since it's only used for DBM for now, I don't think it'd play any role in that code path, so it should be safe to hardcode a disabled instance
2bf7dde to 91dc0c7
private const string SqlCommentOuthost = "ddh";
private const string SqlCommentVersion = "ddpv";
private const string SqlCommentEnv = "dde";
private const string SqlCommentBaseHash = "ddsh";
The PR description says
> injects the base hash into SQL comments (as `ddch`)

I couldn't find either one in the RFC, but the dd-trace-py PR uses `ddsh`, like this one. Is that a typo in the PR description?
public string? ContainerTagsHash
{
    get;
    private set;
}

/// <summary>
/// Gets the base64 representation of the hash
/// </summary>
public string? B64Value
{
    get;
    private set;
}
These properties used to have Volatile.Read()/Volatile.Write() and we should probably keep that since they are written from a background thread in DiscoveryService and read in the hot path when creating spans.
Furthermore, UpdateContainerTagsHash updates both values non-atomically, so a reader could see a stale B64Value with a new ContainerTagsHash. If consistency between the two is important, consider using a lock to read/write both values, or using immutable copies.
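One way to read the "immutable copies" suggestion is sketched below: both values live in a single immutable object that is swapped with one volatile write, so a reader can never pair an old `B64Value` with a new `ContainerTagsHash`. The type names `HashSnapshot` and `HashHolder` are hypothetical and do not come from the PR; this is an illustration under those assumptions, not the tracer's actual implementation.

```csharp
#nullable enable
using System.Threading;

// Hypothetical immutable pair: both values are fixed at construction time.
public sealed class HashSnapshot
{
    public HashSnapshot(string? containerTagsHash, string? base64Value)
    {
        ContainerTagsHash = containerTagsHash;
        Base64Value = base64Value;
    }

    public string? ContainerTagsHash { get; }

    public string? Base64Value { get; }
}

// Hypothetical holder: the background thread publishes a new snapshot,
// the hot path reads the current one with a single volatile read.
public sealed class HashHolder
{
    private HashSnapshot _current = new HashSnapshot(null, null);

    public HashSnapshot Current => Volatile.Read(ref _current);

    public void Update(string? containerTagsHash, string? base64Value)
    {
        // A single reference write publishes both values together.
        Volatile.Write(ref _current, new HashSnapshot(containerTagsHash, base64Value));
    }
}
```

A reader would take `var snapshot = holder.Current;` once and use both properties from that local, rather than reading the holder twice.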
    hash = FnvHash64.GenerateHash(containerTagsHash, FnvHash64.Version.V1, hash);
}

var b64 = Convert.ToBase64String(BitConverter.GetBytes(hash));
This code is allocating:
- a `byte[]` in `BitConverter.GetBytes()`
- a `char[]` for the parameter in `TrimEnd(params char[])` in .NET Framework (newer runtimes have a `TrimEnd(char)` overload)
- a `string` in `TrimEnd()` if it modifies the string
- another `string` instance for each `Replace()` that modifies the string

Good news! We have "vendored" versions of `BinaryPrimitives` and `Base64`, so we can avoid `BitConverter.GetBytes()` and `Convert.ToBase64String()`, and then trimming and replacing 1:1 chars can be done in place, so this code should work in all TFMs:
#if NETCOREAPP3_1_OR_GREATER
Span<byte> buf = stackalloc byte[12];
#else
// can't stackalloc into the vendored Span<T>
var buf = new byte[12];
#endif
BinaryPrimitives.WriteUInt64LittleEndian(buf, hash); // write 8 bytes into a 12-byte buffer
Base64.EncodeToUtf8InPlace(buf, 8, out int bytesWritten);
// strip the trailing '=' padding
while (bytesWritten > 0 && buf[bytesWritten - 1] == (byte)'=')
{
    bytesWritten--;
}
// map '+' and '/' to the URL-safe '-' and '_' in place
for (int i = 0; i < bytesWritten; i++)
{
    if (buf[i] == (byte)'+')
    {
        buf[i] = (byte)'-';
    }
    else if (buf[i] == (byte)'/')
    {
        buf[i] = (byte)'_';
    }
}
#if NETCOREAPP3_1_OR_GREATER
return Encoding.ASCII.GetString(buf[..bytesWritten]);
#else
// can't use Range
return Encoding.ASCII.GetString(buf, 0, bytesWritten);
#endif

This has zero heap allocations on NETCOREAPP3_1_OR_GREATER, and only the byte[12] otherwise (aside from the final string we need to return in both cases, which is unavoidable).
using System.Threading;
using Datadog.Trace.PlatformHelpers;
Not used.
using System.Threading;
using Datadog.Trace.PlatformHelpers;
public static Scope? CreateDbCommandScope(Tracer tracer, IDbCommand command)
{
    var commandType = command.GetType();
    var baseHash = tracer.TracerManager.ServiceRemappingHash?.B64Value;
Should we guard this behind the setting?
var baseHash = tracer.Settings.DbmInjectSqlBasehash ?
    tracer.TracerManager.ServiceRemappingHash?.B64Value :
    null;

    }
}

private static string Compute(string processTags, string? containerTagsHash)
While working on the "less-allocatey" code below, I noticed there are no unit tests for this method.
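As a possible starting point for such tests, the sketch below assumes the intended encoding is unpadded, URL-safe base64 of the hash's little-endian bytes, as the suggested snippet implies. `Base64UrlSketch.Encode` is a hypothetical reference implementation built only on BCL calls, not the tracer's `Compute` method; it could serve as an oracle when checking the allocation-free version.

```csharp
using System;

public static class Base64UrlSketch
{
    // Hypothetical reference (allocating) implementation, for test vectors only.
    public static string Encode(ulong hash)
    {
        var bytes = BitConverter.GetBytes(hash);
        if (!BitConverter.IsLittleEndian)
        {
            // Normalize to little-endian so results match on any platform.
            Array.Reverse(bytes);
        }

        return Convert.ToBase64String(bytes)
            .TrimEnd('=')       // drop padding
            .Replace('+', '-')  // URL-safe alphabet
            .Replace('/', '_');
    }
}
```

Under these assumptions, `Encode(0)` yields `AAAAAAAAAAA` and `Encode(ulong.MaxValue)` yields `__________8`, which covers both the padding trim and the `/` replacement.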
if (!_warnedOnSet)
{
    _warnedOnSet = true;
    Log.Error("The code is trying to set the value '{Value}' to {Prop}, but this has no effect in .NET Framework.", value, nameof(ContainerTagsHash));
This log is now gone in the new version. Intentional?
/// <summary>
/// Gets the base64 representation of the hash
/// </summary>
public string? B64Value
[Naming nit] The .NET naming conventions would use `Base64Value` here, or simply `Base64`. No need to abbreviate "Base" to "B".
related: #8363
Instead of guarding the caller with #if !NETFRAMEWORK, make the setter a silent no-op. This avoids conflict with #8061 which replaces the caller entirely.
## Summary of changes
Add the ability to write the container tags hash to DBM queries and to the related span.
The goal is for DBM to query the spans bearing that hash, then use the container tags on those span(s) to enrich the queries.
This is controlled by a setting that is disabled by default and would be enabled if the propagation mode is "service" or greater.
See RFC: https://docs.google.com/document/d/15GtNOKGBCt6Dc-HsDNnMmCdZwhewFQx8yUlI9in5n3M
Related PR in Python: DataDog/dd-trace-py#15293
## Reason for change
DBM and DSM propagate service context in outbound communications (SQL comments, message headers), but neither product has awareness of the container environment (e.g., `kube_cluster`, `namespace`, `pod_name`). Propagating full container tags is not feasible due to cardinality constraints (query cache invalidation in OracleDB/SQLServer, exponential pathway growth in DSM) and size limitations (64–128 bytes for DBM non-comment methods).
This is needed for the service renaming initiative (defining services based on container names) and APM primary tags (container-based dimensions like Kubernetes cluster).
The solution: the agent computes a hash of low-cardinality container tags and back-propagates it to the tracer, which includes it in outbound DBM/DSM communications. DBM then resolves the hash by correlating with APM spans that carry the same hash as a span tag.
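To make the outbound half of this concrete, here is a minimal sketch of prepending the hash to a query as a SQL comment. The helper name `PrependBaseHash` is hypothetical, the key `ddsh` is taken from the `SqlCommentBaseHash` constant quoted in the review, and the real propagator emits additional keys (service, environment, version) that are omitted here.

```csharp
#nullable enable
using System;

public static class SqlCommentSketch
{
    // Hypothetical helper: returns the command text with a DBM-style
    // comment carrying only the base hash. Not the tracer's actual code.
    public static string PrependBaseHash(string commandText, string? baseHash)
    {
        if (string.IsNullOrEmpty(baseHash))
        {
            // Feature disabled or hash not yet received from the agent.
            return commandText;
        }

        // URL-escape the value, as SQL-comment propagation formats usually do.
        return $"/*ddsh='{Uri.EscapeDataString(baseHash)}'*/ {commandText}";
    }
}
```

Because the hash is a short, low-cardinality token, the comment stays the same across queries from a given deployment, which is the property the cardinality argument above relies on.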
## Implementation details
- `BaseHash` static class that computes an FNV-64 hash of `ProcessTags.SerializedTags` combined with the container tags hash from the agent, encoded as base64
- `DiscoveryService`, stored in `ContainerMetadata.ContainerTagsHash`
- `ContainerMetadata` converted from static to instance class (singleton via `ContainerMetadata.Instance`) to improve testability
- `DatabaseMonitoringPropagator` injects the base hash into SQL comments (as `ddch`) when `DD_DBM_INJECT_SQL_BASEHASH` is true
- `_dd.dbm_container_tags_hash` span tag on `SqlTags` so DBM can correlate the hash back to the span's container tags
- `DD_DBM_INJECT_SQL_BASEHASH` (disabled by default), intended to be enabled when DBM propagation mode is `service` or higher
- `MinimalAgentHeaderHelper` for agent communication
## Test coverage
Adding a test in `DbScopeFactoryTests.cs` forced me to inject the value from pretty high up, which I find a bit "dirty", but at least we don't have to rely on a global static instance in tests.
## Other details