feat(metrics): add /metrics which returns prometheus metrics #95
8ec058c to 551d454
f"inf_batch_current_size {batch_current_size}\n" +
f"inf_queue_size {queue_size}\n"
Not sure the naming is the best; I copied it from TGI.
Are we okay that this will always be 1?
If it bothers you or makes things unclear for customers, I can remove inf_batch_current_size.
LGTM, maybe let's wait and see if @philschmid agrees on the metrics reported and we can merge 🤗 Thanks!
551d454 to 05868e5
LGTM
f"inf_batch_current_size {batch_current_size}\n" +
f"inf_queue_size {queue_size}\n"
Are we okay that this will always be 1?
This will be useful for Dedicated Endpoints to improve request-forwarding fairness and scaling.
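For readers following the thread, here is a minimal sketch of how the two values quoted in the diff could be rendered as gauges in the Prometheus text exposition format. The function name `render_metrics` and the `# TYPE` lines are assumptions added for illustration; the PR's actual handler and wiring are not shown in this conversation.

```python
def render_metrics(batch_current_size: int, queue_size: int) -> str:
    """Render two gauges in the Prometheus text exposition format.

    Hypothetical helper: the metric names come from the quoted diff,
    everything else is an illustrative assumption.
    """
    return (
        # A "# TYPE" line declares the metric type for the samples that follow.
        "# TYPE inf_batch_current_size gauge\n"
        f"inf_batch_current_size {batch_current_size}\n"
        "# TYPE inf_queue_size gauge\n"
        f"inf_queue_size {queue_size}\n"
    )


if __name__ == "__main__":
    # Example values only: a batch size of 1 (as discussed in the review)
    # and three requests waiting in the queue.
    print(render_metrics(batch_current_size=1, queue_size=3), end="")
```

If `inf_batch_current_size` were dropped as suggested in the review, only the `inf_queue_size` lines would remain; a scraper that uses queue depth for forwarding or scaling decisions would not be affected.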