Add perfinsights middleware which writes the latency of API calls #2383

Open. Wants to merge 1 commit into base: main

Conversation

josephschorr
Member

Adds perfinsights middleware, which writes the latency of API calls to a new native histogram bucket.

This should help users determine top-line API latency.

@josephschorr requested a review from a team as a code owner on May 7, 2025, 20:42
@github-actions bot added the area/CLI (Affects the command line), area/api v1 (Affects the v1 API), and area/dependencies (Affects dependencies) labels on May 7, 2025
Contributor

@vroldanbet left a comment

Looks good! Mostly nits!

@@ -0,0 +1,2 @@
// Package perfinsights defines middleware that reports the latency of API calls to Prometheus.
Contributor

We already have a middleware that does this: the grpc-ecosystem Prometheus middleware.
Please clarify what is different about this one (e.g. query shapes, finer-grained detail about
the API payload).

@@ -170,6 +180,8 @@ func checkResultToAPITypes(cr *dispatch.ResourceCheckResult) (v1.CheckPermission
}

func (ps *permissionServer) CheckBulkPermissions(ctx context.Context, req *v1.CheckBulkPermissionsRequest) (*v1.CheckBulkPermissionsResponse, error) {
	perfinsights.SetInContext(ctx, perfinsights.NoLabels)
Contributor

Add a comment clarifying that insights are added further down the line for the individual check elements.

DispatchChunkSize: defaultIfZero(config.DispatchChunkSize, 100),
MaxCheckBulkConcurrency: defaultIfZero(config.MaxCheckBulkConcurrency, 50),
CaveatTypeSet: caveattypes.TypeSetOrDefault(config.CaveatTypeSet),
ExpiringRelationshipsEnabled: config.ExpiringRelationshipsEnabled,
Contributor

was expiring rels getting enabled by default?

Member Author

Only for the relationships APIs, but since it wasn't enabled for schema, it was basically a no-op (except it caused us to request the column)


func labelsForFilter(filter *v1.RelationshipFilter) perfinsights.APIShapeLabels {
	if filter == nil {
		return perfinsights.APIShapeLabels{}
Contributor

Suggested change
- return perfinsights.APIShapeLabels{}
+ return perfinsights.NoLabels

Member Author

Changed here and everywhere else

@@ -575,3 +592,23 @@ func checkIfFilterIsEmpty(filter *v1.RelationshipFilter) error {

return nil
}

func labelsForFilter(filter *v1.RelationshipFilter) perfinsights.APIShapeLabels {
Contributor

create unit test
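A table-driven test is probably the quickest way to cover this. The sketch below is not the PR's code: `RelationshipFilter` is a trimmed-down local stand-in for `v1.RelationshipFilter`, `APIShapeLabels` is assumed to be a plain string map, and the label keys and the body of `labelsForFilter` are hypothetical guesses at the intended behavior (record which filter fields are set, never their values, so label cardinality stays low).

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// RelationshipFilter is a hypothetical, trimmed-down stand-in for v1.RelationshipFilter.
type RelationshipFilter struct {
	ResourceType       string
	OptionalResourceID string
	OptionalRelation   string
}

// APIShapeLabels is assumed to be a string map here; the PR's real type may differ.
type APIShapeLabels map[string]string

// NoLabels is the shared "empty shape" value. Callers must treat it as read-only.
var NoLabels = APIShapeLabels{}

// labelsForFilter is an illustrative guess, not the PR's implementation: it reports
// the resource type and which optional fields are present, but not their values.
func labelsForFilter(filter *RelationshipFilter) APIShapeLabels {
	if filter == nil {
		return NoLabels
	}
	var set []string
	if filter.OptionalResourceID != "" {
		set = append(set, "resource_id")
	}
	if filter.OptionalRelation != "" {
		set = append(set, "relation")
	}
	return APIShapeLabels{
		"resource_type": filter.ResourceType,
		"filter":        strings.Join(set, ","),
	}
}

func main() {
	cases := []struct {
		name   string
		filter *RelationshipFilter
		want   APIShapeLabels
	}{
		{"nil filter", nil, NoLabels},
		{"type only", &RelationshipFilter{ResourceType: "document"},
			APIShapeLabels{"resource_type": "document", "filter": ""}},
		{"type and relation", &RelationshipFilter{ResourceType: "document", OptionalRelation: "viewer"},
			APIShapeLabels{"resource_type": "document", "filter": "relation"}},
	}
	for _, tc := range cases {
		got := labelsForFilter(tc.filter)
		if !reflect.DeepEqual(got, tc.want) {
			fmt.Printf("FAIL %s: got %v, want %v\n", tc.name, got, tc.want)
			continue
		}
		fmt.Printf("ok   %s\n", tc.name)
	}
}
```

In a real `_test.go` file the loop would live in a `TestLabelsForFilter(t *testing.T)` function with one `t.Run` subtest per case.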

}

// UnaryServerInterceptor returns a gRPC server-side interceptor that provides reporting for Unary RPCs.
func UnaryServerInterceptor(isEnabled bool) grpc.UnaryServerInterceptor {
Contributor

Please create a unit test that mounts this interceptor over a gRPC server and confirms a request generates the metrics

return
}

builder := ctxKey.Value(r.ctx)
Contributor

super nit: maybe "shapeClosure"?

Comment on lines 117 to 123
if exemplarObserver, ok := o.(prometheus.ExemplarObserver); ok {
	if spanCtx := trace.SpanContextFromContext(ctx); spanCtx.HasTraceID() {
		traceID := prometheus.Labels{"traceID": spanCtx.TraceID().String()}
		exemplarObserver.ObserveWithExemplar(duration.Seconds(), traceID)
		return
	}
}
Contributor

this also def needs testing

Member Author

The plan is to do end-to-end testing with this

}

o := APIShapeLatency.WithLabelValues(labels...)
if exemplarObserver, ok := o.(prometheus.ExemplarObserver); ok {
Contributor

Is there a way to type assert this into a global var, instead of keeping the base interface?

Member Author

Not in a nice way :/

FilterLabel APIShapeLabel = "filter"
)

var allLabels = []string{
Contributor

why not a slice of APIShapeLabel? you seem to be type-asserting in the main for loop

Member Author

We pass it as labels, and those must be strings
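One possible middle ground, sketched here and not what the PR does, is to keep the typed slice as the source of truth and convert it once into the `[]string` that the Prometheus vector constructors expect. `FilterLabel` comes from the diff; `ResourceTypeLabel`, `typedLabels`, and `labelNames` are hypothetical names added for illustration.

```go
package main

import "fmt"

// APIShapeLabel is the typed label name, as in the diff.
type APIShapeLabel string

const (
	ResourceTypeLabel APIShapeLabel = "resource_type" // hypothetical label
	FilterLabel       APIShapeLabel = "filter"        // from the diff
)

// typedLabels keeps the label set type-safe for the code that builds shapes.
var typedLabels = []APIShapeLabel{ResourceTypeLabel, FilterLabel}

// labelNames converts the typed labels into plain strings, preserving order.
func labelNames(labels []APIShapeLabel) []string {
	names := make([]string, 0, len(labels))
	for _, l := range labels {
		names = append(names, string(l))
	}
	return names
}

func main() {
	// The converted slice is what would be handed to a histogram vector constructor.
	fmt.Println(labelNames(typedLabels))
}
```

The trade-off is an extra conversion at initialization versus a conversion inside the loop that reads label values.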

Member Author

Updated

Comment on lines +3 to +4
// Unlike the gRPC middleware, the perf insights middleware records API calls by "shape", versus
// aggregating all calls simply by API kind. The shapes allow for more granular determination of
Copy link
Contributor

@miparnisari, May 8, 2025

Nitpick: to me, "shape" and "kind" are kind of synonyms. It might be good to define "shape".

EDIT: ah, it's defined in the code itself 👍 disregard

4 participants