Add job to automatically generate nimbus performance reports #326

Open · wants to merge 1 commit into main
20 changes: 20 additions & 0 deletions jobs/nimbus-perf-reports/Dockerfile
@@ -0,0 +1,20 @@
FROM continuumio/miniconda3
WORKDIR /app

# Install dependencies for gsutil
RUN apt-get update && apt-get install -y \
curl \
gnupg \
&& curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
&& echo "deb https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list \
&& apt-get update && apt-get install -y google-cloud-sdk \
&& apt-get clean

# Create the conda environment and install python dependencies
COPY requirements.txt .
RUN conda create --name nimbusperf python=3.10 && \
conda run -n nimbusperf pip install --no-cache-dir -r requirements.txt

COPY . .
ENV BUCKET_URL="gs://moz-fx-data-prot-nonprod-c3a1-protodash/perf-reports"
CMD ["conda", "run", "--no-capture-output", "-n", "nimbusperf", "/app/entry.sh"]
26 changes: 26 additions & 0 deletions jobs/nimbus-perf-reports/README.md
@@ -0,0 +1,26 @@
# Nimbus Perf Reports

## Overview

Nimbus Perf Reports is a Python project imported from https://github.com/dpalmeiro/telemetry-perf-reports, designed for analyzing and generating performance reports based on telemetry data. This job checks for any recently finished Nimbus experiments and automatically generates and publishes a performance report covering basic performance metrics for each of them.

## Dependencies

You can install the necessary Python dependencies with:

`pip install -r requirements.txt`

This project also requires the [gcloud sdk](https://cloud.google.com/sdk/docs/install) and expects that authentication has already been established.
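
If you have not set up authentication yet, the standard gcloud login flow should be sufficient (assuming a default SDK installation):

`gcloud auth login`

Depending on your setup, application-default credentials may also be needed:

`gcloud auth application-default login`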

## Usage

To generate a report locally:

1. Define a config for your experiment. See https://github.com/dpalmeiro/telemetry-perf-reports/tree/main/configs for some examples, or the sketch below.
2. `./generate-perf-report --config <path to config>`
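
For reference, the `find-latest-experiment` script writes configs with the following shape; a hand-written config for local use can follow the same pattern (the slug below is a placeholder, and the metric names are examples drawn from this job's defaults):

```json
{
  "slug": "my-experiment-slug",
  "segments": ["Windows", "Linux", "Mac"],
  "histograms": [
    "metrics.timing_distribution.performance_pageload_fcp",
    "metrics.timing_distribution.performance_pageload_load_time"
  ],
  "pageload_event_metrics": {
    "fcp_time": [0, 30000],
    "load_time": [0, 30000]
  }
}
```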


To update the existing perf-reports list on protosaur, you can run this through docker:

1. `docker build -t nimbusperf-app .`
2. `docker run -it --rm -v ~/.config/gcloud:/root/.config/gcloud:ro nimbusperf-app`
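
The volume mount in step 2 shares your local gcloud credentials with the container (read-only) so that `gsutil` inside the container can authenticate against the bucket.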
12 changes: 12 additions & 0 deletions jobs/nimbus-perf-reports/ci_job.yaml
@@ -0,0 +1,12 @@
build-nimbus-perf-reports:
docker:
- image: << pipeline.parameters.git-image >>
steps:
- checkout
- compare-branch:
pattern: ^jobs/nimbus-perf-reports/
- setup_remote_docker:
version: << pipeline.parameters.docker-version >>
- run:
name: Build Docker image
command: docker build -t app:build jobs/nimbus-perf-reports
13 changes: 13 additions & 0 deletions jobs/nimbus-perf-reports/ci_workflow.yaml
@@ -0,0 +1,13 @@
nimbus-perf-reports:
jobs:
- build-nimbus-perf-reports
- gcp-gcr/build-and-push-image:
context: data-eng-airflow-gcr
docker-context: jobs/nimbus-perf-reports/
path: jobs/nimbus-perf-reports/
image: nimbus-perf-reports
requires:
- build-nimbus-perf-reports
filters:
branches:
only: main
31 changes: 31 additions & 0 deletions jobs/nimbus-perf-reports/entry.sh
@@ -0,0 +1,31 @@
#!/bin/bash
set -euxo pipefail
export PATH=/opt/conda/envs/nimbusperf/bin:$PATH

# Clear any pre-existing reports
rm -rf reports

echo -e "\n*************************************************"
echo -e "Copying index.html\n"
gsutil cp "$BUCKET_URL/index.html" index.html
cp index.html index-backup.html

echo -e "\n*************************************************"
echo -e "Generating reports...\n"
python find-latest-experiment index.html

# Only update the index if new reports were generated.
if [ -d "reports" ] && [ "$(ls -A reports)" ]; then
  for file in reports/* ; do
    echo -e "\n*************************************************"
    echo -e "Updating index.html with $file"
    python update-index --index index.html --append "$file"

    echo -e "Transferring $file to protosaur\n"
    gsutil cp "$file" "$BUCKET_URL/$(basename "$file")"
  done

  echo -e "\n*************************************************"
  echo -e "Uploading new index.html to protosaur\n"
  gsutil cp index-backup.html "$BUCKET_URL/index-backup.html"
  gsutil cp index.html "$BUCKET_URL/index.html"
fi
196 changes: 196 additions & 0 deletions jobs/nimbus-perf-reports/find-latest-experiment
@@ -0,0 +1,196 @@
#!/usr/bin/env python3
import requests
import json
import sys
import os
import numpy as np
from lib.generate import generate_report
from datetime import datetime, timedelta
from bs4 import BeautifulSoup as bs

default_histograms = [
"metrics.timing_distribution.performance_pageload_fcp",
"metrics.timing_distribution.performance_pageload_load_time",
"metrics.timing_distribution.perf_largest_contentful_paint",
"metrics.timing_distribution.networking_http_channel_sub_open_to_first_sent_https_rr",

"metrics.timing_distribution.perf_startup_cold_view_app_to_first_frame",
"metrics.timing_distribution.perf_startup_cold_unknwn_app_to_first_frame",
"metrics.timing_distribution.perf_startup_cold_main_app_to_first_frame",
"metrics.timing_distribution.perf_startup_application_on_create",
"metrics.timing_distribution.geckoview_startup_runtime",

"metrics.timing_distribution.performance_interaction_tab_switch_composite",
"metrics.timing_distribution.recent_synced_tabs_recent_synced_tab_time_to_load",
"metrics.timing_distribution.perf_awesomebar_search_engine_suggestions"
]

default_events = {
"fcp_time" : [0, 30000],
"lcp_time" : [0, 30000],
"load_time": [0, 30000],
"response_time": [0, 30000]
}

# Encoder that converts numpy scalar and array types to native Python
# types so the experiment config can be serialized with json.dump.
class NpEncoder(json.JSONEncoder):
def default(self, obj):
if isinstance(obj, np.integer):
return int(obj)
if isinstance(obj, np.floating):
return float(obj)
if isinstance(obj, np.ndarray):
return obj.tolist()
return super(NpEncoder, self).default(obj)

# Only generate reports for Desktop or Android experiments.
def is_supported_experiment(exp):
if not (exp['appName'] == 'firefox_desktop' or exp['appName'] == 'fenix'):
print("--> unsupported platform.")
return False

# Skip experiments with no branches
if len(exp['branches']) == 0:
print("--> no branches found.")
return False

# If this is an experiment with only 1 branch, then pretend it's a rollout.
if not exp['isRollout'] and len(exp['branches']) == 1:
exp['isRollout'] = True

# Cannot generate a performance report for rollouts that enroll nearly the entire population (no control group remains).
if exp['isRollout'] and len(exp['branches']) == 1 and exp['branches'][0]['ratio'] >= 0.9:
print("--> no control population available.")
return False

return True

# Check if the experiment finished recently.
def is_recent_experiment(date_string, days=3):
given_date = datetime.strptime(date_string, "%Y-%m-%d")
now = datetime.now()

    # Check if the given date falls within the last `days` days
days_ago = now - timedelta(days)
return days_ago <= given_date

def filter_and_sort(experiments):
# Remove invalid entries (those with None as endDate)
experiments[:] = [exp for exp in experiments if exp["endDate"] is not None]

# Sort the remaining entries by endDate
experiments.sort(key=lambda x: x["endDate"])

def retrieve_nimbus_experiment_list():
    url = 'https://experimenter.services.mozilla.com/api/v6/experiments/'
print(f"Loading nimbus experiment list from {url}")

response = requests.get(url)
if response.ok:
values = response.json()
return values
else:
print(f"Failed to retrieve {url}: {response.status_code}")
sys.exit(1)

def extract_existing_reports(index_file):
with open(index_file, 'r') as file:
soup = bs(file, 'html.parser')

# Find the table containing experiment reports
experiment_table = soup.find('table', class_='experiment-table')
experiments = {}

if experiment_table:
rows = experiment_table.find_all('tr')[1:] # Skip the header row
for row in rows:
cells = row.find_all('td')
            if len(cells) >= 4:
experiment_name = cells[0].get_text(strip=True)
experiments[experiment_name] = {
'start_date': cells[1].get_text(strip=True),
'end_date': cells[2].get_text(strip=True),
'channel': cells[3].get_text(strip=True)
}

return experiments

def generate_histogram_metrics(exp):
return default_histograms

def generate_event_metrics(exp):
return default_events

# Create a config for the experiment, and return a dict of
# args used to generate the experiment report.
def create_config_for_experiment(exp):
    config = {}
config['slug'] = exp['slug']

if exp['appName'] == 'firefox_desktop':
config['segments'] = ['Windows', 'Linux', 'Mac']
elif exp['appName'] == 'fenix':
config['segments'] = ['Android']

config['histograms'] = generate_histogram_metrics(exp)
config['pageload_event_metrics'] = generate_event_metrics(exp)

configFile = f"{exp['slug']}.json"
with open(configFile, 'w') as f:
json.dump(config, f, indent=2, cls=NpEncoder)

    # Minimal stand-in for argparse.Namespace, so generate_report can read
    # these values via attribute access just like parsed CLI arguments.
    class ArgsDict(dict):
def __getattr__(self, name):
return self[name]
def __setattr__(self, name, value):
self[name] = value

args = ArgsDict()
args.config = configFile
args.dataDir = 'data'
args.reportDir = 'reports'
args.skip_cache = False
args.html_report = True
return args

def main():
if len(sys.argv) < 2:
print("Error: Please provide path to existing reports index.html file.")
sys.exit(1)

index_file = sys.argv[1]
if not os.path.isfile(index_file):
print(f"Error: Cannot find '{index_file}'")
sys.exit(1)

# Get list of reports already created by slug
reports = extract_existing_reports(index_file)

# Get experiment list
experiments = retrieve_nimbus_experiment_list()

# Sort list by endDate
filter_and_sort(experiments)

for exp in experiments:
print("Checking ", exp['slug'], "...")

if not is_recent_experiment(exp['endDate']):
print("---> not recent")
continue

if not is_supported_experiment(exp):
continue

if exp['slug'] in reports:
print("---> already exists")
continue

print('---------------------------')
print(f"Generating Report for {exp['slug']}")
print("Config:")
print(json.dumps(exp, indent=2))
args = create_config_for_experiment(exp)
generate_report(args)

if __name__ == "__main__":
main()
21 changes: 21 additions & 0 deletions jobs/nimbus-perf-reports/generate-perf-report
@@ -0,0 +1,21 @@
#!/usr/bin/env python3
import os
import sys
import argparse
from lib.generate import generate_report

def parseArguments():
parser = argparse.ArgumentParser(description='Process telemetry performance report.')
parser.add_argument('--config', type=str, required=True, help="Input JSON config file.")
parser.add_argument('--dataDir', type=str, default="data", help="Directory to save data to.")
parser.add_argument('--reportDir', type=str, default="reports", help="Directory to save results to.")
parser.add_argument('--skip-cache', action=argparse.BooleanOptionalAction,
default=False, help="Ignore any cached files on disk, and regenerate them.")
parser.add_argument('--html-report', action=argparse.BooleanOptionalAction,
default=True, help="Generate html report.")
args = parser.parse_args()
return args

if __name__ == "__main__":
args = parseArguments()
generate_report(args)