Merged
9 changes: 9 additions & 0 deletions README.md
@@ -71,6 +71,15 @@ Installs only the lightweight core tools from `core/opl/` to minimize dependency
     source venv/bin/activate
     python3 -m pip install --no-cache-dir -e "git+https://github.com/redhat-performance/opl.git#egg=opl-rhcloud-perf-team-core&subdirectory=core"

+PostgreSQL support in `pass_or_fail.py` (history/decisions plugins) is optional.
+`psycopg2-binary` is not included in the core install. Selecting a PostgreSQL plugin
+raises `ModuleNotFoundError` at runtime if the package is missing. Install one of:
+
+    pip install psycopg2-binary
+    pip install "git+https://github.com/redhat-performance/opl.git#egg=opl-rhcloud-perf-team-core[postgresql]&subdirectory=core"
+
+Or use the full installation below, which pulls in extras (including `psycopg2-binary`).
+
 **Extras Installation:**

 Note: Do you really need to do this? If you do full installation, it does exactly this.
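
The optional-dependency behaviour described in this hunk can be checked up front, before a PostgreSQL plugin is selected. A minimal sketch, assuming only that the driver is importable as `psycopg2`; the helper name `require_optional` and its hint text are hypothetical and not part of `opl`:

```python
import importlib


def require_optional(module_name, hint):
    """Import an optional module, or raise ImportError with an install hint.

    Hypothetical helper for illustration; it is not part of opl.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        raise ImportError(f"{module_name} is not installed. {hint}") from exc


if __name__ == "__main__":
    try:
        require_optional("psycopg2", "Run 'pip install psycopg2-binary'.")
        print("PostgreSQL plugins are usable")
    except ImportError as exc:
        print(f"PostgreSQL plugins are unavailable: {exc}")
```

Catching `ImportError` covers `ModuleNotFoundError` as well, since the latter is its subclass.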
76 changes: 61 additions & 15 deletions core/opl/investigator/README.md
@@ -7,17 +7,17 @@ the same test and decide if new test result is PASS or FAIL.

 You can configure multiple things as of now:

-1. How to get historical results of the test (supports ElasticSearch, CSV
-   and directory of JSON files)
-3. How to load new result (support just JSON file)
-4. What method to use to actually find if new result is out of safe bounds
-   (we mostly use `if new result is biggeer than max or smaller than min
+1. How to get historical results of the test (supports ElasticSearch,
+   PostgreSQL, CSV and directory of JSON files)
+2. How to load new result (support just JSON file)
+3. What method to use to actually find if new result is out of safe bounds
+   (we mostly use `if new result is bigger than max or smaller than min
    of historical data, it is FAIL`, but it is easy to implement more)
-6. What metrics from the JSONs to compare (e.g. `results.rps`,
+4. What metrics from the JSONs to compare (e.g. `results.rps`,
    `monitoring.pod.cpu.mean` and `monitoring.pod.memory.mean`)
-8. Optionally you can also configure where to store metadata about
+5. Optionally you can also configure where to store metadata about
    decision the script done. This is useful to keep track about trends
-   (supports ElasticSearch and CSV)
+   (supports ElasticSearch, PostgreSQL and CSV)

 See `sample_config.yaml` for example configuration. This is what each
 section is for:
@@ -43,7 +43,7 @@ That query is Jinja2 template-able, so you can include:
 to dynamically obtain the value from new result and use it to filter for
 historical results.

-There are other plugins you can use to retrieve distorical data:
+There are other plugins you can use to retrieve historical data:

 * `elasticsearch` - Retrieves historical data from ElasticSearch
   and is described above
@@ -58,6 +58,32 @@ There are other plugins you can use to retrieve distorical data:
       run-2022-01-09T04:20:04+00:00,Test XYZ,151
       run-2022-01-11T01:16:47+00:00,Test XYZ,144

+* `postgresql` - Retrieves historical data from a PostgreSQL database.
+  The SQL query should return rows where the first column is a JSON/JSONB
+  document (a status data document). The query is Jinja2 template-able
+  just like `es_query`. Example:
+
+      type: postgresql
+      pg_host: db.example.com
+      pg_port: 5432
+      pg_database: perf_results
+      pg_user: reader
+      pg_password_env_var: OPL_DB_PASSWORD
+      pg_query: |
+        SELECT data FROM results
+        WHERE data->>'name' = '{{ current.get("name") }}'
+        ORDER BY (data->>'started')::timestamptz DESC
+        LIMIT 30
+
+  Requires the optional PostgreSQL dependency (not installed with core by default).
+  Install one of:
+
+      pip install psycopg2-binary
+      pip install "opl-rhcloud-perf-team-core[postgresql] @ ..." # core-only install
+      pip install git+https://github.com/redhat-performance/opl.git # full install (includes extras)
+
+  Without the optional dependency installed, using the PostgreSQL plugin raises `ModuleNotFoundError` at runtime.
+
 * `sd_dir` - directory with status data files from past experiments
   which allows filtering by matching various fields before loading data.
   Below is example where we load data from SD files whose `name` matches
@@ -75,15 +101,15 @@ There are other plugins you can use to retrieve distorical data:
 This specifies from where we should load new (`current`) test result
 we will be evaluating.

-There is only choice now that loads current result from status data file.
+There is only one choice now that loads current result from status data file.

-This can be overwriten by `--current-file` command line option.
+This can be overwritten by `--current-file` command-line option.


 `methods:`
 ----------

-Alows you to specify list of checks you want to use to check results.
+Allows you to specify list of checks you want to use to check results.
 These checks are defined in `check.py`. Impractical example:

     methods:
@@ -121,9 +147,9 @@ not represent test result comparable across historical runs - this
 timestamp is simply always different, based on when the test was running,
 so it does not make sense to compare it across historical results).

-It also allws you to define what check methods (other then these defined in
+It also allows you to define what check methods (other than these defined in
 default `methods:` should apply to this metric. You can also provide
-additional possitional args if the check method needs them.
+additional positional args if the check method needs them.

 Expected data structure looks like this:

@@ -165,5 +191,25 @@ As of now you can use these decisions storage plugins:
   Kibana you can have investigation of decision trends dashboards or so.
 * `csv` - stores all the decisions for current test in a CSV file
   (overwritten every time the tool is invoked)
+* `postgresql` - stores each decision as a JSON document in a PostgreSQL
+  table. The table should have a `data` column of type `JSON` or `JSONB`.
+  Example:
+
+      type: postgresql
+      pg_host: db.example.com
+      pg_port: 5432
+      pg_database: perf_results
+      pg_table: decisions
+      pg_user: writer
+      pg_password_env_var: OPL_DB_PASSWORD
+
+  Requires the optional PostgreSQL dependency (not installed with core by default).
+  Install one of:
+
+      pip install psycopg2-binary
+      pip install "opl-rhcloud-perf-team-core[postgresql] @ ..." # core-only install
+      pip install git+https://github.com/redhat-performance/opl.git # full install (includes extras)
+
+  Without the optional dependency installed, using the PostgreSQL plugin raises `ModuleNotFoundError` at runtime.

-This can be turned off with `--dry-run` command line option.
+This can be turned off with `--dry-run` command-line option.
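
The default check quoted at the top of this README (`if new result is bigger than max or smaller than min of historical data, it is FAIL`) can be sketched in a few lines. This is an illustration only: `check_min_max` is a hypothetical name, the real checks live in `check.py` and may differ, and treating an empty history as PASS is an assumption of the sketch.

```python
def check_min_max(history, current):
    """Return "PASS" if current lies within [min(history), max(history)], else "FAIL".

    Assumption: with no historical data there is nothing to compare against,
    so this sketch returns "PASS".
    """
    if not history:
        return "PASS"
    if current > max(history) or current < min(history):
        return "FAIL"
    return "PASS"


if __name__ == "__main__":
    history = [151, 144, 148]  # illustrative historical values of one metric
    print(check_min_max(history, 150))  # inside the historical range
    print(check_min_max(history, 170))  # above the historical maximum
```

Values equal to the historical minimum or maximum pass, because only strictly larger or smaller results are treated as out of bounds here.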
35 changes: 35 additions & 0 deletions core/opl/investigator/config.py
@@ -93,6 +93,17 @@ def render_query(args, template_data):
     args.history_es_query = yaml.load(rendered, Loader=yaml.SafeLoader)


+def render_pg_query(args, template_data):
+    logging.debug(
+        f"Rendering Jinja2 template pg_query {args.history_pg_query} with data {template_data}"
+    )
+    env = jinja2.Environment(loader=jinja2.DictLoader({"query": args.history_pg_query}))
+    template = env.get_template("query")
+    rendered = template.render(template_data)
+    logging.debug(f"Rendered Jinja2 template pg_query {rendered}")
+    args.history_pg_query = rendered
+
+
 def render_matchers(args, template_data):
     logging.debug(
         f"Rendering Jinja2 template matchers {args.history_matchers} with data {template_data}"
@@ -111,6 +122,8 @@ def load_config_finish(args, sd):
     render_sets(args, template_data)
     if args.history_type == "elasticsearch":
         render_query(args, template_data)
+    if args.history_type == "postgresql":
+        render_pg_query(args, template_data)
     if args.history_type == "sd_dir":
         render_matchers(args, template_data)

@@ -146,6 +159,16 @@ def load_config(conf, fp):
         else:
             conf.history_es_server_verify = True

+    if conf.history_type == "postgresql":
+        conf.history_pg_host = data["history"]["pg_host"]
+        conf.history_pg_port = data["history"].get("pg_port", 5432)
+        conf.history_pg_database = data["history"]["pg_database"]
+        conf.history_pg_query = data["history"]["pg_query"]
+        if "pg_user" in data["history"]:
+            conf.history_pg_user = data["history"]["pg_user"]
+        if "pg_password_env_var" in data["history"]:
+            conf.history_pg_password_env_var = data["history"]["pg_password_env_var"]
+
     if conf.history_type == "sd_dir":
         conf.history_dir = data["history"]["dir"]
         conf.history_matchers = data["history"]["matchers"]
@@ -168,6 +191,18 @@ def load_config(conf, fp):
         else:
             conf.decisions_es_server_verify = True

+    if conf.decisions_type == "postgresql":
+        conf.decisions_pg_host = data["decisions"]["pg_host"]
+        conf.decisions_pg_port = data["decisions"].get("pg_port", 5432)
+        conf.decisions_pg_database = data["decisions"]["pg_database"]
+        conf.decisions_pg_table = data["decisions"]["pg_table"]
+        if "pg_user" in data["decisions"]:
+            conf.decisions_pg_user = data["decisions"]["pg_user"]
+        if "pg_password_env_var" in data["decisions"]:
+            conf.decisions_pg_password_env_var = data["decisions"][
+                "pg_password_env_var"
+            ]
+
     if conf.decisions_type == "csv":
         conf.decisions_filename = data["decisions"].get(
             "file", data["decisions"].get("filename")
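
The `load_config` hunks above treat `pg_host`, `pg_database` and `pg_query` (or `pg_table`) as required keys, default `pg_port` to 5432, and set `pg_user` and `pg_password_env_var` only when they are present. A minimal sketch of that pattern, using a plain dict in place of the parsed YAML and `types.SimpleNamespace` in place of the real `conf` object; `load_history_pg` is a hypothetical name, not a function in `config.py`:

```python
import types


def load_history_pg(conf, data):
    """Copy PostgreSQL history settings from a parsed config dict onto conf."""
    history = data["history"]
    conf.history_pg_host = history["pg_host"]            # required: KeyError if missing
    conf.history_pg_port = history.get("pg_port", 5432)  # optional, default port
    conf.history_pg_database = history["pg_database"]    # required
    conf.history_pg_query = history["pg_query"]          # required
    # Optional credentials are only set when configured, so the attribute
    # may be absent from conf afterwards.
    for key in ("pg_user", "pg_password_env_var"):
        if key in history:
            setattr(conf, f"history_{key}", history[key])
    return conf


if __name__ == "__main__":
    parsed = {
        "history": {
            "type": "postgresql",
            "pg_host": "db.example.com",
            "pg_database": "perf_results",
            "pg_query": "SELECT data FROM results LIMIT 30",
        }
    }
    conf = load_history_pg(types.SimpleNamespace(), parsed)
    print(conf.history_pg_port, hasattr(conf, "history_pg_user"))
```

A missing required key surfaces as a `KeyError` at config-load time rather than later, when the connection is opened.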
80 changes: 80 additions & 0 deletions core/opl/investigator/postgresql_decisions.py
@@ -0,0 +1,80 @@
+import datetime
+import json
+import logging
+import os
+import re
+
+_SQL_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")
+
+
+def _validate_sql_identifier(name, kind="identifier"):
+    if not _SQL_IDENTIFIER.fullmatch(name):
+        raise ValueError(f"Invalid PostgreSQL {kind}: {name!r}")
+
+
+def store(pg_host, pg_port, pg_database, table, decisions, **kwargs):
+    try:
+        import psycopg2
+    except ImportError as exc:
+        raise ImportError(
+            "PostgreSQL support requires psycopg2-binary, which is not included in "
+            "the core install. Run 'pip install psycopg2-binary', install core with "
+            "the [postgresql] extra, or install the full opl package (includes extras)."
+        ) from exc
+
+    _validate_sql_identifier(table, "table name")
+
+    pg_user = kwargs.get("pg_user")
+    pg_password_env_var = kwargs.get("pg_password_env_var")
+
+    db_conf = {
+        "host": pg_host,
+        "port": pg_port,
+        "database": pg_database,
+    }
+    if pg_user:
+        db_conf["user"] = pg_user
+    if pg_password_env_var:
+        db_conf["password"] = os.environ.get(pg_password_env_var)
+
+    job_name = os.environ.get("JOB_NAME", "")
+    build_url = os.environ.get("BUILD_URL", "")
+
+    try:
+        connection = psycopg2.connect(**db_conf)
+    except psycopg2.Error as exc:
+        logging.warning(
+            f"Failed to connect to PostgreSQL {pg_host}:{pg_port}/{pg_database}: {exc}"
+        )
+        return
+
+    cursor = connection.cursor()
+
+    try:
+        for decision in decisions:
+            decision["job_name"] = job_name
+            decision["build_url"] = build_url
+            decision["uploaded"] = datetime.datetime.now(
+                tz=datetime.timezone.utc
+            ).isoformat()
+
+            logging.info(
+                f"Storing decision to PostgreSQL {pg_host}:{pg_port}/{pg_database} table={table} json={json.dumps(decision)}"
+            )
+            try:
+                cursor.execute("SAVEPOINT store_decision")
+                cursor.execute(
+                    f"INSERT INTO {table} (data) VALUES (%s)",
+                    [json.dumps(decision)],
+                )
+                cursor.execute("RELEASE SAVEPOINT store_decision")
+            except psycopg2.Error as exc:
+                logging.warning(f"Failed to store decision to PostgreSQL: {exc}")
+                cursor.execute("ROLLBACK TO SAVEPOINT store_decision")
+
+        connection.commit()
+    except psycopg2.Error as exc:
+        logging.warning(f"Failed to commit decisions to PostgreSQL: {exc}")
+    finally:
+        cursor.close()
+        connection.close()
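
The per-decision `SAVEPOINT` pattern in `store()` can be tried without a PostgreSQL server. The sketch below uses the standard-library `sqlite3` module as a stand-in, which is an assumption made for portability: in PostgreSQL a failed statement aborts the whole transaction until a rollback, which is why the savepoint matters there, whereas SQLite would tolerate the failed insert even without one. The `UNIQUE` constraint, the `store_all` and `demo` names and the in-memory database are all illustrative; the identifier check mirrors `_validate_sql_identifier`.

```python
import json
import re
import sqlite3

_SQL_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


def store_all(connection, table, rows):
    """Insert rows as JSON text, one savepoint per row; return the number stored."""
    # A table name cannot be passed as a bound parameter, so validate it
    # before interpolating it into the SQL text.
    if not _SQL_IDENTIFIER.fullmatch(table):
        raise ValueError(f"Invalid table name: {table!r}")

    stored = 0
    cursor = connection.cursor()
    cursor.execute("BEGIN")
    try:
        for row in rows:
            cursor.execute("SAVEPOINT store_row")
            try:
                cursor.execute(
                    f"INSERT INTO {table} (data) VALUES (?)", [json.dumps(row)]
                )
                stored += 1
            except sqlite3.Error:
                # Undo only this row; earlier rows in the transaction survive.
                cursor.execute("ROLLBACK TO SAVEPOINT store_row")
            cursor.execute("RELEASE SAVEPOINT store_row")
        cursor.execute("COMMIT")
    finally:
        cursor.close()
    return stored


def demo():
    # isolation_level=None disables implicit transactions in sqlite3, so the
    # BEGIN and COMMIT in store_all are issued explicitly.
    connection = sqlite3.connect(":memory:", isolation_level=None)
    connection.execute("CREATE TABLE decisions (data TEXT UNIQUE)")
    rows = [{"result": "PASS"}, {"result": "PASS"}, {"result": "FAIL"}]
    stored = store_all(connection, "decisions", rows)
    total = connection.execute("SELECT COUNT(*) FROM decisions").fetchone()[0]
    connection.close()
    return stored, total


if __name__ == "__main__":
    print(demo())
```

The duplicate second row violates the `UNIQUE` constraint and is rolled back on its own, while the first and third rows are committed together.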
60 changes: 60 additions & 0 deletions core/opl/investigator/postgresql_loader.py
@@ -0,0 +1,60 @@
+import logging
+import os
+import tempfile
+
+import opl.status_data
+
+
+def load(pg_host, pg_port, pg_database, query, paths, **kwargs):
+    try:
+        import psycopg2
+    except ImportError as exc:
+        raise ImportError(
+            "PostgreSQL support requires psycopg2-binary, which is not included in "
+            "the core install. Run 'pip install psycopg2-binary', install core with "
+            "the [postgresql] extra, or install the full opl package (includes extras)."
+        ) from exc
+
+    pg_user = kwargs.get("pg_user")
+    pg_password_env_var = kwargs.get("pg_password_env_var")
+
+    db_conf = {
+        "host": pg_host,
+        "port": pg_port,
+        "database": pg_database,
+    }
+    if pg_user:
+        db_conf["user"] = pg_user
+    if pg_password_env_var:
+        db_conf["password"] = os.environ.get(pg_password_env_var)
+
+    out = {}
+
+    for path in paths:
+        out[path] = []
+
+    logging.info(
+        f"Querying PostgreSQL on {pg_host}:{pg_port}/{pg_database} with query={query}"
+    )
+
+    connection = psycopg2.connect(**db_conf)
+    cursor = connection.cursor()
+    cursor.execute(query)
+
+    for row in cursor:
+        data = row[0]
+        logging.debug(
+            f"Loading data from row with id={data.get('id', None)} name={data.get('name', None)}"
+        )
+        tmpfile = tempfile.NamedTemporaryFile(delete=False).name
+        sd = opl.status_data.StatusData(tmpfile, data=data)
+        for path in paths:
+            tmp = sd.get(path)
+            if tmp is not None:
+                out[path].append(tmp)
+
+    cursor.close()
+    connection.close()
+
+    logging.debug(f"Loaded {out}")
+    return out
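
`load()` returns a dict that maps each requested path to the list of values found in the returned rows, skipping rows where the path yields `None`. A stand-alone sketch of that collection step, assuming each row's first column is already a decoded dict and that paths are dot-separated keys such as `results.rps`; `get_path` and `collect` are hypothetical stand-ins for `opl.status_data.StatusData.get` and the loop above, and the real lookup rules may differ:

```python
def get_path(document, path):
    """Follow a dot-separated path through nested dicts; None if any key is missing."""
    node = document
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def collect(rows, paths):
    """Build {path: [values...]} from rows whose first column is a dict."""
    out = {path: [] for path in paths}
    for row in rows:
        data = row[0]
        for path in paths:
            value = get_path(data, path)
            if value is not None:
                out[path].append(value)
    return out


if __name__ == "__main__":
    rows = [
        ({"name": "Test XYZ", "results": {"rps": 151}},),
        ({"name": "Test XYZ", "results": {}},),  # no rps recorded in this run
        ({"name": "Test XYZ", "results": {"rps": 144}},),
    ]
    print(collect(rows, ["results.rps", "monitoring.pod.cpu.mean"]))
```

Paths with no data at all still appear in the result with an empty list, so a caller can tell "no history" apart from "path not requested".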
22 changes: 22 additions & 0 deletions core/opl/investigator/sample_config.yaml
@@ -8,6 +8,19 @@ history:
   # matchers: |
   #   name: "{{ current.get('name') }}"

+  # type: postgresql
+  # Requires psycopg2-binary (not in core): pip install psycopg2-binary
+  # pg_host: db.example.com
+  # pg_port: 5432
+  # pg_database: perf_results
+  # pg_user: reader
+  # pg_password_env_var: OPL_DB_PASSWORD
+  # pg_query: |
+  #   SELECT data FROM results
+  #   WHERE data->>'name' = '{{ current.get("name") }}'
+  #   ORDER BY (data->>'started')::timestamptz DESC
+  #   LIMIT 30
+
   type: elasticsearch
   es_server: http://elasticsearch.example.com:9286
   es_index: my-index
@@ -50,6 +63,15 @@ decisions:
   # type: csv
   # filename: /tmp/decisions.csv

+  # type: postgresql
+  # Requires psycopg2-binary (not in core): pip install psycopg2-binary
+  # pg_host: db.example.com
+  # pg_port: 5432
+  # pg_database: perf_results
+  # pg_table: decisions
+  # pg_user: writer
+  # pg_password_env_var: OPL_DB_PASSWORD
+
   type: elasticsearch
   es_server: http://elasticsearch.example.com:9286
   es_index: my_aa_decisions