Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve master_commit_red query performance #6174

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

clee2000
Copy link
Contributor

@clee2000 clee2000 commented Jan 15, 2025

Pre filter the commits so we can filter the workflow job and workflow run tables on it later

This improves speed for all time ranges up to 1 year. I did not check beyond that
I believe the memory used is about the same, but it scans more rows for some reason

This is the query behind this chart
image
on the metrics page

Copy link

vercel bot commented Jan 15, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
torchci ✅ Ready (Inspect) Visit Preview Jan 15, 2025 7:12pm

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 15, 2025
@clee2000 clee2000 marked this pull request as ready for review January 15, 2025 19:27
@clee2000 clee2000 requested a review from a team January 15, 2025 19:28
workflow_job job FINAL
JOIN workflow_run FINAL ON workflow_run.id = workflow_job.run_id
JOIN push FINAL ON workflow_run.head_commit.'id' = push.head_commit.'id'
default.workflow_job job final join all_runs workflow_run on workflow_run.id = workflow_job.run_id
Copy link
Contributor

@huydhn huydhn Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads a bit confusing IMO. If I read it correctly, the all_runs table has the alias as workflow_run, which has the correct syntax. But I always think of workflow_run as the workflow_run table instead of having it as an alias. So, it feels easier just to call it all_runs

Copy link
Contributor

@huydhn huydhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only see the improvement on https://hud.pytorch.org/query_execution_metrics after this lands. I wonder if there is a way to get the information for the new query at PR time. Let's chat more on this when you're back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants