Add perf metrics collection, TCO estimator, and JMeter load-test wrapper by sidd190 · Pull Request #142 · openMF/mifos-gazelle

sidd190 · 2026-05-12T18:01:11Z

Summary

Adds a self-contained performance/resource metrics and total cost of ownership (TCO) estimation toolkit for Mifos Gazelle, aligned with the current codebase and deployment model.

Addresses the spirit of GAZ-5: give operators and contributors a repeatable way to snapshot what the stack consumes on a real cluster and produce directional cloud cost estimates from those snapshots.

Evidence Artifacts from the cloud VM that was run on GCP, and the metrics generated!
metrics-after.json
metrics-before.json
tco-result.json
summary.txt
live-metrics.json

---

What was added

Path	Purpose
`src/utils/perf/collect-metrics.sh`	Collects per-pod CPU/memory via `kubectl top pods` for Gazelle namespaces (`infra`, `mifosx`, `paymenthub`, `vnext`). Optional `--storage` flag sums requested PVC capacity per namespace. Writes a single JSON report. Supports `--mock` for local testing without a cluster.
`src/utils/perf/tco-estimate.py`	Reads the metrics JSON and estimates monthly/annual cost using an embedded instance catalog (with optional `--pricing-file`), configurable `--egress-gib`, `--topology` (`single-node` \| `ha-3node`), and `--headroom`. Supports `--json-out` and `--all-providers` comparison.
`src/utils/perf/run-load-test.sh`	Headless wrapper around `performance-testing/paymentHubEE.jmx`. Runs JMeter with parameterized threads/duration/host, and invokes `collect-metrics.sh --storage` before/after when snapshots are enabled. Supports `--mock` dry-run.
`docs/PERF-TCO.md`	Usage guide, prerequisites (including JDK for JMeter), data provenance, instructions for exporting evidence from a test VM, and interpretation notes.

Implementation notes

gawk / Ubuntu compatibility: collect-metrics.sh avoids using namespace as an awk variable name (reserved in gawk), ensuring kubectl top output parses correctly on Ubuntu 22.04+.
Load-test snapshots call collect-metrics.sh --storage so TCO inputs can use measured PVC requests when the cluster is reachable.

Quick start

See docs/PERF-TCO.md for full details. Typical flow on a machine with kubectl pointed at a Gazelle cluster:

# Collect metrics
bash src/utils/perf/collect-metrics.sh --storage --out /tmp/gazelle-metrics.json

# Estimate TCO
python3 src/utils/perf/tco-estimate.py \
  --metrics /tmp/gazelle-metrics.json \
  --topology ha-3node \
  --egress-gib 25

# Run load test with before/after snapshots
bash src/utils/perf/run-load-test.sh --threads 10 --duration 120 --out /tmp/lt

# Estimate TCO from post-load snapshot
python3 src/utils/perf/tco-estimate.py \
  --metrics /tmp/lt/metrics-after.json \
  --topology ha-3node \
  --egress-gib 25

Measured inputs vs. modeled costs

Note: Dollar amounts are not read from a cloud bill. They are a model built from cluster snapshots plus assumptions — suitable for planning and comparison, not procurement or finance sign-off without refreshed pricing.

Grounded in real cluster data (when run live):

CPU and memory totals and per-namespace rollups from kubectl top (point-in-time usage, not guaranteed limits)
With --storage: per-namespace storage_gib from summed PVC requested sizes (not bytes used on disk)

Modeled or user-supplied:

Instance type and hourly rate from the built-in catalog or --pricing-file (indicative on-demand Linux prices; pricing_as_of / source are surfaced in output)
Compute: hourly_rate × 730 h/month × topology multiplier (ha-3node uses a 3× planning multiplier)
Storage: requested GiB × per-GiB/month rate × topology multiplier
Network: --egress-gib × per-GiB egress rate (defaults are conservative placeholders)
Per-DPG cost lines: heuristic allocation by each component's share of measured memory across namespaces

Caveats and known limitations

kubectl top reflects current usage; short spikes may not be captured, and limits/requests differ from usage
PVC "storage" is requested capacity from the API, not actual disk consumption
TCO omits real-world line items: support, backups, cross-AZ traffic, load balancers, logging sinks, discounts, committed use, etc.
paymentHubEE.jmx is unchanged in this PR. On a real Gazelle deployment, requests may fail at the HTTP layer (auth, paths, tenant config) until the plan is aligned with current PHEE ingress and APIs. The wrapper, snapshots, and JTL/HTML reporting still validate the pipeline end-to-end.

Testing

collect-metrics.sh --mock
Live run on a GCP Ubuntu VM (e2-standard-4) with full Gazelle deploy; JSON output includes mode: "live", namespaces, and storage_gib when --storage is used
tco-estimate.py on real metrics JSON; --all-providers and --json-out exercised
run-load-test.sh end-to-end with JMeter + JDK; JMeter samples executed; HTTP failures documented as JMX/config gap, not script failure
run-load-test.sh --mock dry-run without JMeter

Follow-up (separate PR)

JMeter plan maintenance: Update or supplement performance-testing/paymentHubEE.jmx (and/or document required JMeter properties) so a default run against a standard Gazelle install achieves a meaningful success rate without manual credential configuration. Requires input from PHEE maintainers on supported public test endpoints and safe demo credentials.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add perf metrics collection, TCO estimator, and JMeter load-test wrapper#142

Add perf metrics collection, TCO estimator, and JMeter load-test wrapper#142
sidd190 wants to merge 3 commits into
openMF:devfrom
sidd190:perf-tco-tools

sidd190 commented May 12, 2026 •

edited

Loading

Uh oh!

tdaly-mifos commented May 12, 2026

Uh oh!

tdaly61 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

sidd190 commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What was added

Implementation notes

Quick start

Measured inputs vs. modeled costs

Caveats and known limitations

Testing

Follow-up (separate PR)

Related

Uh oh!

tdaly-mifos commented May 12, 2026

Uh oh!

tdaly61 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sidd190 commented May 12, 2026 •

edited

Loading