Skip to content

feat: add __unsymbolized__ label on ingest path #4147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 30, 2025

Conversation

marcsanmi
Copy link
Contributor

This PR adds the __has_native_unsymbolized_profiles__ label on the ingest path.

Feel free to share your thought on the label naming

@marcsanmi marcsanmi requested a review from a team as a code owner April 27, 2025 19:03
}
locations := h.symbols.Symbols().Locations
for i := range locations {
HasNativeUnsymbolizedProfiles = HasNativeUnsymbolizedProfiles || len(locations[i].Line) == 0
Copy link
Collaborator

@kolesnikovae kolesnikovae Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HasNativeUnsymbolizedProfiles = HasNativeUnsymbolizedProfiles || len(locations[i].Line) == 0

Absence of lines in a location does not mean it's unsymbolized: it might be a malformed profile, a bug, etc. – in fact, in most cases we want to drop such locations (it's possible that we already do so, but #4051 suggests that there's an issue).

Besides, in distributors, we normalize profiles: if a mapping has HasFunctions set, we're zeroing addresses to reduce the overhead.

https://github.com/grafana/pyroscope/blob/main/pkg/pprof/pprof.go#L414-L428

This means it might be a location without lines and without address, or a location that refers to a stub mapping. In all cases, we can't do anything about it.

We should validate and sanitize mappings and locations at normalization:

  1. Ensure that locations without lines and without address are discarded.
  2. Ensure that locations without lines and without a valid mapping (not a stub we create) are discarded.

HasFunctions is a mapping attribute. If the attribute is set, we should presume that the profile contains functions, and locations only refer to functions present in the profile. We already reset addresses for locations that refer to mappings with HasFunctions set.

Partially symbolized mappings should not have such attribute – another aspect to handle during normalization.

This is from the official pprof package:

// HasFunctions determines if all locations in this profile have
// symbolized function information.
func (p *Profile) HasFunctions() bool {
	for _, l := range p.Location {
		if l.Mapping == nil || !l.Mapping.HasFunctions {
			return false
		}
	}
	return true
}

Go pprof:

	// TODO: we set HasFunctions if all symbols from samples were symbolized (hasFuncs).
	// Decide what to do about HasInlineFrames and HasLineNumbers.
	// Also, another approach to handle the mapping entry with
	// incomplete symbolization results is to dupliace the mapping
	// entry (but with different Has* fields values) and use
	// different entries for symbolized locations and unsymbolized locations.
	if hasFuncs {
		b.pb.bool(tagMapping_HasFunctions, true)
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 3aacd69.

LabelValueDatasetTSDBIndex = "dataset_tsdb_index"
LabelNameTenantDataset = "__tenant_dataset__"
LabelValueDatasetTSDBIndex = "dataset_tsdb_index"
LabelNameHasNativeUnsymbolizedProfiles = "__has_native_unsymbolized_profiles__"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can name it the same way we refer to it when we discuss it:

__unsymbolized__

Changing the name later won't be this easy; we should be careful.

Copy link
Contributor Author

@marcsanmi marcsanmi Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I changed it after talking with Tolyan. That's why I mentioned to share the thoughts about the naming, otherwise any change after this, it'll be difficult. We've these 3 options, but I like your proposal:

  • __unsymbolized__
  • __has_native_unsymbolized_profiles__
  • __has_native_profiles__

wdyt? @korniltsev

Copy link
Collaborator

@korniltsev korniltsev Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the __unsymbolized__
I did not like the __has_native_profiles__

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, native is too ambiguous

@marcsanmi marcsanmi force-pushed the marcsanmi/add-native-profiles-symbolization-label branch from 23761fe to 3aacd69 Compare April 29, 2025 06:39
@marcsanmi marcsanmi changed the title feat: add __has_native_unsymbolized_profiles__ label on ingest path feat: add __unsymbolized__ label on ingest path Apr 29, 2025
Copy link
Collaborator

@korniltsev korniltsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we don't have running github actions, lets run tests manually

I think TestIngestOTLP 🔴

@korniltsev
Copy link
Collaborator

or maybe merge in main branch. I think tests workflow is already running in main

@marcsanmi marcsanmi force-pushed the marcsanmi/add-native-profiles-symbolization-label branch from 3aacd69 to 74a35d9 Compare April 29, 2025 20:03
@marcsanmi
Copy link
Contributor Author

I've just updated the branch from main, but the tests are passing 😕

@korniltsev
Copy link
Collaborator

@marcsanmi I am sorry for the noise. Just checked and it is red on main branch on my machine 🤷 maybe the test is flaky. I need to double check later

locations := symbols.Locations
mappings := symbols.Mappings
for _, loc := range locations {
if loc.MappingId == 0 || int(loc.MappingId) >= len(mappings) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mappingId==0 is a valid mapping as it should be an index into rewritten slice of mappings

Copy link
Collaborator

@korniltsev korniltsev Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the other case int(loc.MappingId) >= len(mappings) is also impossible to miss given our dedupSlice implementation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func TestUnsymbolized(t *testing.T) {
	profile := &googlev1.Profile{
		StringTable: []string{"", "a"},
		Function: []*googlev1.Function{
			{Id: 4, Name: 1},
		},
		Mapping: []*googlev1.Mapping{{Id: 239, HasFunctions: true}},
		Location: []*googlev1.Location{
			{Id: 5, MappingId: 239, Line: []*googlev1.Line{{FunctionId: 4, Line: 1}}},
		},
		Sample: []*googlev1.Sample{
			{LocationId: []uint64{5}, Value: []int64{1}},
		},
	}
	symbols := NewPartitionWriter(0, &Config{
		Version: FormatV3,
	})
	symbols.WriteProfileSymbols(profile)
	unsymbolized := HasUnsymbolizedProfiles(symbols.Symbols())
	assert.False(t, unsymbolized)
}

Please double check the above profile. It looks like it should not have __unsymbolized__

@marcsanmi marcsanmi force-pushed the marcsanmi/add-native-profiles-symbolization-label branch from 74a35d9 to 349fe52 Compare April 30, 2025 10:32
@marcsanmi marcsanmi requested a review from korniltsev April 30, 2025 10:32
Copy link
Collaborator

@korniltsev korniltsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@korniltsev korniltsev merged commit b19fa15 into main Apr 30, 2025
9 checks passed
@korniltsev korniltsev deleted the marcsanmi/add-native-profiles-symbolization-label branch April 30, 2025 11:05
simonswine added a commit that referenced this pull request May 1, 2025
korniltsev pushed a commit that referenced this pull request May 2, 2025
korniltsev pushed a commit that referenced this pull request May 2, 2025
* feat: add __has_native_unsymbolized_profiles__ label on ingest path

* Update unsymbolized check logic & update label name

* Fix wrong HasUnsymbolizedProfiles logic and add tests
korniltsev added a commit that referenced this pull request May 2, 2025
simonswine pushed a commit that referenced this pull request May 2, 2025
* chore: add v2 integration tests

* lint

* fix metrics checks

* maybe make v2 tests less flaky

* fix race

* fix segment writer shutdown race

* feat: add __unsymbolized__ label on ingest path (#4147)

* feat: add __has_native_unsymbolized_profiles__ label on ingest path

* Update unsymbolized check logic & update label name

* Fix wrong HasUnsymbolizedProfiles logic and add tests

* Revert "feat: add __unsymbolized__ label on ingest path (#4147)"

This reverts commit eb12a50.

* Revert "fix segment writer shutdown race"

This reverts commit 63b2bd6.

* fix segment writer shutdown race with a per shard WaitGroup

---------

Co-authored-by: Marc Sanmiquel <[email protected]>
marcsanmi added a commit that referenced this pull request May 2, 2025
* feat: add __has_native_unsymbolized_profiles__ label on ingest path

* Update unsymbolized check logic & update label name

* Fix wrong HasUnsymbolizedProfiles logic and add tests
marcsanmi pushed a commit that referenced this pull request May 2, 2025
@korniltsev
Copy link
Collaborator

I think TestIngestOTLP 🔴

It fails if run on go 1.24 if I switch back to 1.23 it does not fail. I think we should run tests on 1.24 for easier future upgrades.

@korniltsev korniltsev requested a review from Copilot May 5, 2025 07:46
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for unsymbolized profiles by introducing a new label on the ingest path and updating several functions to use a datasets map instead of heads.

  • Refactored ingest and flush functions to use a datasets map.
  • Updated logging messages and error messages to reflect the new naming.
  • Added tests and helper functions to evaluate unsymbolized profiles.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
pkg/experiment/ingester/segment.go Renamed heads to datasets and updated related log/error messages.
pkg/experiment/ingester/memdb/head_test.go Added tests for unsymbolized profile handling.
pkg/experiment/ingester/memdb/head.go Introduced HasUnsymbolizedProfiles logic in flush implementation.
pkg/experiment/block/metadata/metadata_labels.go Added new label constant for unsymbolized profiles.

locations := symbols.Locations
mappings := symbols.Mappings
for _, loc := range locations {
if !mappings[loc.MappingId].HasFunctions {
Copy link
Preview

Copilot AI May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a nil check or verifying that mappings[loc.MappingId] exists before accessing its HasFunctions field to avoid potential nil pointer dereferences.

Suggested change
if !mappings[loc.MappingId].HasFunctions {
mapping := mappings[loc.MappingId]
if mapping == nil {
continue
}
if !mapping.HasFunctions {

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants