Improve performance of Metabase tail/log.txt #32

@preaction

Right now, we generate the Metabase tail/log.txt on the backend every 10 minutes. The process takes 5-8 minutes, so the data is always slightly out of date. This isn't a huge problem on its own, except that the Metabase runs on two servers, and each server has its own version of the tail log. So, anyone trying to coordinate data could get different data from one request to the next.

Getting the list of reports takes mere seconds, which means that the performance problem must be somewhere outside of the database.

It's possible to make the process faster in a couple of ways. The biggest win would be to make finding the CPAN author of the distribution faster. This could involve fixing the CPAN::Testers::Schema::Result::TestReport relationship to the uploads table (right now it's not a relationship at all). Unfortunately, the test_report and uploads tables cannot be easily joined since they have different character encodings, so any solution will have to address that. Another possibility would be to grab all the information from the uploads table in a single request: collect the list of dist/versions, execute one query to get the data, and build a hash for lookups.
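
The single-request idea could look roughly like the sketch below. This is only an illustration, not the existing code: build_author_lookup, fetch_uploads, the dist/version/author field names, and the sample data are all hypothetical. fetch_uploads stands in for the one real query against the uploads table, and the character encoding mismatch mentioned above is not handled here.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the uploads table.
my %FAKE_UPLOADS = (
    'Foo-Bar' => { '1.0'  => 'AUTHORA' },
    'Baz'     => { '0.02' => 'AUTHORB' },
);

# Hypothetical stand-in for ONE query against the uploads table.
# Takes a list of [ dist, version ] pairs and returns one row (hashref)
# per pair that was found.
sub fetch_uploads {
    my (@pairs) = @_;
    my @rows;
    for my $pair (@pairs) {
        my ( $dist, $version ) = @$pair;
        next unless exists $FAKE_UPLOADS{$dist}
            && exists $FAKE_UPLOADS{$dist}{$version};
        push @rows, {
            dist    => $dist,
            version => $version,
            author  => $FAKE_UPLOADS{$dist}{$version},
        };
    }
    return @rows;
}

# Collect the unique dist/version pairs from the reports, fetch them all
# in a single call, and build a two-level hash for fast lookups.
sub build_author_lookup {
    my ( $reports, $fetch ) = @_;
    my %seen;
    my @pairs = grep { !$seen{ join "\0", @$_ }++ }
        map { [ $_->{dist}, $_->{version} ] } @$reports;

    my %author_for;
    for my $row ( $fetch->(@pairs) ) {
        $author_for{ $row->{dist} }{ $row->{version} } = $row->{author};
    }
    return \%author_for;
}

my @reports = (
    { dist => 'Foo-Bar', version => '1.0' },
    { dist => 'Foo-Bar', version => '1.0' },    # duplicate, fetched once
    { dist => 'Baz',     version => '0.02' },
    { dist => 'Missing', version => '9.9' },    # not in uploads
);

my $author_for = build_author_lookup( \@reports, \&fetch_uploads );

for my $report (@reports) {
    my $by_version = $author_for->{ $report->{dist} } || {};
    my $author     = $by_version->{ $report->{version} } // 'UNKNOWN';
    print "$report->{dist}-$report->{version} by $author\n";
}
```

The point of the design is that the per-report work becomes a hash lookup, so the number of queries no longer grows with the number of reports. Whether that is where the time actually goes still needs to be confirmed by profiling.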

It would be good to profile this code to figure out what's slow before any performance improvements are made (and also to verify the efficacy of any performance improvements). The tail log can be generated by running perl bin/cpantesters-legacy-metabase eval 'app->refresh_tail_log'. Try using NYTProf to profile the code.
