-
Notifications
You must be signed in to change notification settings - Fork 649
feat: add __unsymbolized__ label on ingest path #4147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add __unsymbolized__ label on ingest path #4147
Conversation
} | ||
locations := h.symbols.Symbols().Locations | ||
for i := range locations { | ||
HasNativeUnsymbolizedProfiles = HasNativeUnsymbolizedProfiles || len(locations[i].Line) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasNativeUnsymbolizedProfiles = HasNativeUnsymbolizedProfiles || len(locations[i].Line) == 0
Absence of lines in a location does not mean it's unsymbolized: it might be a malformed profile, a bug, etc. – in fact, in most cases we want to drop such locations (it's possible that we already do so, but #4051 suggests that there's an issue).
Besides, in distributors, we normalize profiles: if a mapping has HasFunctions
set, we're zeroing addresses to reduce the overhead.
https://github.com/grafana/pyroscope/blob/main/pkg/pprof/pprof.go#L414-L428
This means it might be a location without lines and without address, or a location that refers to a stub mapping. In all cases, we can't do anything about it.
We should validate and sanitize mappings and locations at normalization:
- Ensure that locations without lines and without address are discarded.
- Ensure that locations without lines and without a valid mapping (not a stub we create) are discarded.
HasFunctions
is a mapping attribute. If the attribute is set, we should presume that the profile contains functions, and locations only refer to functions present in the profile. We already reset addresses for locations that refer to mappings with HasFunctions
set.
Partially symbolized mappings should not have such attribute – another aspect to handle during normalization.
This is from the official pprof
package:
// HasFunctions determines if all locations in this profile have
// symbolized function information.
func (p *Profile) HasFunctions() bool {
for _, l := range p.Location {
if l.Mapping == nil || !l.Mapping.HasFunctions {
return false
}
}
return true
}
Go pprof:
// TODO: we set HasFunctions if all symbols from samples were symbolized (hasFuncs).
// Decide what to do about HasInlineFrames and HasLineNumbers.
// Also, another approach to handle the mapping entry with
// incomplete symbolization results is to dupliace the mapping
// entry (but with different Has* fields values) and use
// different entries for symbolized locations and unsymbolized locations.
if hasFuncs {
b.pb.bool(tagMapping_HasFunctions, true)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 3aacd69.
LabelValueDatasetTSDBIndex = "dataset_tsdb_index" | ||
LabelNameTenantDataset = "__tenant_dataset__" | ||
LabelValueDatasetTSDBIndex = "dataset_tsdb_index" | ||
LabelNameHasNativeUnsymbolizedProfiles = "__has_native_unsymbolized_profiles__" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can name it the same way we refer to it when we discuss it:
__unsymbolized__
Changing the name later won't be this easy; we should be careful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I changed it after talking with Tolyan. That's why I mentioned to share the thoughts about the naming, otherwise any change after this, it'll be difficult. We've these 3 options, but I like your proposal:
__unsymbolized__
__has_native_unsymbolized_profiles__
__has_native_profiles__
wdyt? @korniltsev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the __unsymbolized__
I did not like the __has_native_profiles__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, native
is too ambiguous
23761fe
to
3aacd69
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we don't have running github actions, lets run tests manually
I think TestIngestOTLP
🔴
or maybe merge in main branch. I think tests workflow is already running in main |
3aacd69
to
74a35d9
Compare
I've just updated the branch from |
@marcsanmi I am sorry for the noise. Just checked and it is red on main branch on my machine 🤷 maybe the test is flaky. I need to double check later |
locations := symbols.Locations | ||
mappings := symbols.Mappings | ||
for _, loc := range locations { | ||
if loc.MappingId == 0 || int(loc.MappingId) >= len(mappings) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think mappingId==0 is a valid mapping as it should be an index into rewritten slice of mappings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the other case int(loc.MappingId) >= len(mappings)
is also impossible to miss given our dedupSlice implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func TestUnsymbolized(t *testing.T) {
profile := &googlev1.Profile{
StringTable: []string{"", "a"},
Function: []*googlev1.Function{
{Id: 4, Name: 1},
},
Mapping: []*googlev1.Mapping{{Id: 239, HasFunctions: true}},
Location: []*googlev1.Location{
{Id: 5, MappingId: 239, Line: []*googlev1.Line{{FunctionId: 4, Line: 1}}},
},
Sample: []*googlev1.Sample{
{LocationId: []uint64{5}, Value: []int64{1}},
},
}
symbols := NewPartitionWriter(0, &Config{
Version: FormatV3,
})
symbols.WriteProfileSymbols(profile)
unsymbolized := HasUnsymbolizedProfiles(symbols.Symbols())
assert.False(t, unsymbolized)
}
Please double check the above profile. It looks like it should not have __unsymbolized__
74a35d9
to
349fe52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This reverts commit b19fa15.
* feat: add __has_native_unsymbolized_profiles__ label on ingest path * Update unsymbolized check logic & update label name * Fix wrong HasUnsymbolizedProfiles logic and add tests
This reverts commit eb12a50.
* chore: add v2 integration tests * lint * fix metrics checks * maybe make v2 tests less flaky * fix race * fix segment writer shutdown race * feat: add __unsymbolized__ label on ingest path (#4147) * feat: add __has_native_unsymbolized_profiles__ label on ingest path * Update unsymbolized check logic & update label name * Fix wrong HasUnsymbolizedProfiles logic and add tests * Revert "feat: add __unsymbolized__ label on ingest path (#4147)" This reverts commit eb12a50. * Revert "fix segment writer shutdown race" This reverts commit 63b2bd6. * fix segment writer shutdown race with a per shard WaitGroup --------- Co-authored-by: Marc Sanmiquel <[email protected]>
* feat: add __has_native_unsymbolized_profiles__ label on ingest path * Update unsymbolized check logic & update label name * Fix wrong HasUnsymbolizedProfiles logic and add tests
This reverts commit eb12a50.
It fails if run on go 1.24 if I switch back to 1.23 it does not fail. I think we should run tests on 1.24 for easier future upgrades. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for unsymbolized profiles by introducing a new label on the ingest path and updating several functions to use a datasets map instead of heads.
- Refactored ingest and flush functions to use a datasets map.
- Updated logging messages and error messages to reflect the new naming.
- Added tests and helper functions to evaluate unsymbolized profiles.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
pkg/experiment/ingester/segment.go | Renamed heads to datasets and updated related log/error messages. |
pkg/experiment/ingester/memdb/head_test.go | Added tests for unsymbolized profile handling. |
pkg/experiment/ingester/memdb/head.go | Introduced HasUnsymbolizedProfiles logic in flush implementation. |
pkg/experiment/block/metadata/metadata_labels.go | Added new label constant for unsymbolized profiles. |
locations := symbols.Locations | ||
mappings := symbols.Mappings | ||
for _, loc := range locations { | ||
if !mappings[loc.MappingId].HasFunctions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a nil check or verifying that mappings[loc.MappingId] exists before accessing its HasFunctions field to avoid potential nil pointer dereferences.
if !mappings[loc.MappingId].HasFunctions { | |
mapping := mappings[loc.MappingId] | |
if mapping == nil { | |
continue | |
} | |
if !mapping.HasFunctions { |
Copilot uses AI. Check for mistakes.
This PR adds the
__has_native_unsymbolized_profiles__
label on the ingest path.Feel free to share your thought on the label naming