V3.0.4 is a fast, monitoring-oriented PVsyst report parser.
It is designed to produce a single JSON payload that contains:
- Raw inverter IDs (INV01, INV02, …) as stable keys
- A human-friendly inverter description
- A per-inverter combined_configuration array that consolidates MPPT allocation + the config fields you need for monitoring
V3 is implemented in pvsyst_parser.py.
- Text-only parsing (faster): uses pdfplumber only (no Camelot/table extraction).
- Monitoring-friendly output: per inverter, a single combined_configuration list that includes MPPT → config mapping plus config details.
- Stable IDs + friendly names: JSON keys remain raw inverter IDs; description provides a display label.
- config_id naming: MPPT associations reference config_id instead of array_id.
- Current handling: i_mpp_a in combined_configuration is scaled to the MPPT based on strings-per-config and strings-per-MPPT.
- Multiple module models (per config/MPPT): module manufacturer/model is tracked per array configuration (and therefore can vary by inverter/MPPT). See module_types + module_type_id in the output.
- Single-configuration fallback: supports reports with no Array # blocks (one uniform config).
- Industry heuristics: can infer MPPT topology for common inverter families:
  - SMA Core1: 6 MPPT, max 2 strings/MPPT
  - CHINT / CPS: 3 MPPT, max 6 strings/MPPT
- Validation: Cross-validates parsed data against authoritative "Total Inverter Power" sections
- Enhanced Debugging: Detailed warnings for parsing edge cases with array block inspection
- Python 3.9+ recommended
pip install -r requirements.txt
Note: V3 uses text-only parsing with pdfplumber for faster, more reliable extraction.
Parse a PVsyst PDF and write outputs (text + JSON) into an output directory:
python3 pvsyst_parser.py "path/to/report.pdf" --output-dir "./out"
Outputs:
- *_analysis_v3.txt
- *_structured_v3.json
Optional: generate an additional PowerTrack patch JSON (per inverter):
python3 pvsyst_parser.py "path/to/report.pdf" --powertrack-patch
You can override the output path:
python3 pvsyst_parser.py "path/to/report.pdf" --powertrack-patch \
  --powertrack-patch-path "./out/site_powertrack_patch.json"
When --powertrack-patch is enabled, the parser writes a second JSON file whose top-level keys are PV0, PV1, ... (PowerTrack-style keys), with one patch object per inverter.
- Default mapping: INV01 -> PV0, INV02 -> PV1, ... (derived from the numeric suffix minus 1).
- If an inverter ID has no usable numeric suffix (or would collide), the next available PV{n} is assigned.
- Inverters are processed in sorted order by raw inverter ID, so key assignment is deterministic for a given report.
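The mapping rules above can be sketched in a few lines of Python. This is a minimal illustration, not the parser's own code: the function name assign_powertrack_keys is hypothetical, and the two-pass order (suffix-derived keys first, fallbacks afterwards) is an assumption about how "next available" is resolved.

```python
import re


def assign_powertrack_keys(inverter_ids):
    """Map raw inverter IDs to PV{n} keys (hypothetical helper)."""
    assigned = {}
    used = set()
    pending = []

    # Pass 1: use "numeric suffix minus 1" where it is valid and still free.
    for inv_id in sorted(inverter_ids):
        match = re.search(r"(\d+)$", inv_id)
        index = int(match.group(1)) - 1 if match else -1
        if index >= 0 and index not in used:
            assigned[inv_id] = index
            used.add(index)
        else:
            pending.append(inv_id)

    # Pass 2 (assumed order): give leftovers the lowest unused index.
    next_free = 0
    for inv_id in pending:
        while next_free in used:
            next_free += 1
        assigned[inv_id] = next_free
        used.add(next_free)

    return {inv_id: f"PV{n}" for inv_id, n in assigned.items()}
```

Because the IDs are sorted before assignment, the same set of inverter IDs always yields the same keys, regardless of the order in which they were parsed.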
Each PV{n} entry looks like:
{
"PV0": {
"description": "Inv 01 - (125.0 kW) - SMA Sunny Tripower CORE1",
"pvConfig": {
"inverters": [
{
"numOfStrings": 2,
"panelsPerString": 28,
"wattsPerPanel": 540,
"inverterKw": 125.0,
"azimuth": 180.0,
"tilt": 20.0,
"dcSize": 30.24,
"mppVoltage": 950.0,
"mppAmps": 26.2,
"mppWatts": 24890.0
}
],
"monthlyOutput": {
"jan": 0,
"feb": 0,
"mar": 0,
"apr": 0,
"may": 0,
"jun": 0,
"jul": 0,
"aug": 0,
"sep": 0,
"oct": 0,
"nov": 0,
"dec": 0
},
"degrade": 0.5
}
}
}
Notes:
- pvConfig.inverters[] is MPPT-level (one entry per inverter_summary[INVxx].combined_configuration row).
- monthlyOutput is annual energy split by month in kWh (rounded to integers) with keys jan..dec.
- degrade comes from the PVsyst Array Losses thermal loss percent when available (units are percent, e.g. 0.5 means 0.5%).
- mppVoltage / mppAmps / mppWatts are included only when the underlying values are present (nulls are omitted).
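As a rough sketch of the null-omission rule, the snippet below builds one pvConfig.inverters[] entry from a combined_configuration row. The function name to_patch_entry and the field mapping (strings to numOfStrings, modules_in_series to panelsPerString, dc_kwp to dcSize, u_mpp_v to mppVoltage, i_mpp_a to mppAmps) are assumptions inferred from the field names. mppWatts is computed here as voltage times current, which matches the sample entry (950.0 V × 26.2 A = 24890.0 W) but is not confirmed as the parser's formula.

```python
def to_patch_entry(row, watts_per_panel, inverter_kw):
    """Build one MPPT-level patch entry (hypothetical helper, assumed mapping)."""
    volts = row.get("u_mpp_v")
    amps = row.get("i_mpp_a")
    has_both = volts is not None and amps is not None
    entry = {
        "numOfStrings": row.get("strings"),
        "panelsPerString": row.get("modules_in_series"),
        "wattsPerPanel": watts_per_panel,
        "inverterKw": inverter_kw,
        "azimuth": row.get("azimuth"),
        "tilt": row.get("tilt"),
        "dcSize": row.get("dc_kwp"),
        "mppVoltage": volts,
        "mppAmps": amps,
        # Assumption: MPP power is simply voltage times current.
        "mppWatts": round(volts * amps, 1) if has_both else None,
    }
    # Only the three MPP fields are dropped when null; other keys always stay.
    optional = ("mppVoltage", "mppAmps", "mppWatts")
    return {k: v for k, v in entry.items() if not (k in optional and v is None)}
```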
V3 API entry point is app.py.
Run on port 8000:
uvicorn app:app --reload --host 0.0.0.0 --port 8000
Endpoints:
- POST /api/parse (multipart form field file)
- GET /api/health
Example:
curl -X POST "http://localhost:8000/api/parse" \
-H "accept: application/json" \
-H "Content-Type: multipart/form-data" \
  -F "file=@your_pvsyst_report.pdf"
Notes:
- app.py runs the V3 parse pipeline and returns JSON without writing files.
Start the API server (see above), then open index.html in a browser.
When opened from disk (file://), the UI defaults to http://localhost:8000/api.
Tip: you can override the backend URL with a query param, for example:
index.html?apiBase=http://localhost:8000 (or .../api)
uvicorn app:app --reload --port 8000
V3 Interface: Modern UI that displays combined MPPT configurations and detects all inverter types.
- metadata
- pv_module
- inverter (global inverter info from PVsyst equipment table, if present)
- module_types (distinct PV module types detected)
- inverter_types (distinct inverter types detected)
- array_configurations (keyed by config_id)
- associations (keyed by raw inverter ID)
- inverter_summary (keyed by raw inverter ID)
- system_monthly_production
- system_monthly_globhor
- orientations
For each inverter INVxx, look at:
- inverter_summary[INVxx].description
- inverter_summary[INVxx].combined_configuration[]
- inverter_summary[INVxx].pv_modules (all PV module types feeding that inverter)
Notes:
- inverter_summary[INVxx].pv_module is only populated when that inverter uses exactly one module type; otherwise use pv_modules.
Each entry in combined_configuration is one MPPT row and includes:
- mppt
- config_id
- allocation: strings, modules, dc_kwp
- config fields: tilt, azimuth, modules_in_series, u_mpp_v, i_mpp_a
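For monitoring, it is often convenient to flatten the structured JSON into one record per MPPT. The sketch below assumes the JSON has already been loaded into a dict (for example with json.load); the helper name mppt_rows is hypothetical, and only keys named in this README are read.

```python
def mppt_rows(report):
    """Flatten inverter_summary into one dict per MPPT row (hypothetical helper)."""
    rows = []
    summaries = report.get("inverter_summary", {})
    for inv_id, summary in sorted(summaries.items()):
        for row in summary.get("combined_configuration", []):
            rows.append({
                "inverter": inv_id,
                # Fall back to the raw ID if no description is present.
                "label": summary.get("description", inv_id),
                "mppt": row.get("mppt"),
                "config_id": row.get("config_id"),
                "strings": row.get("strings"),
                "dc_kwp": row.get("dc_kwp"),
            })
    return rows
```

Keying each record by the raw inverter ID keeps it stable across re-parses, while label carries the display name.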
- array_configurations[config_id].i_mpp_a represents the total current for the full configuration (all strings in parallel).
- combined_configuration[].i_mpp_a represents the total current for that MPPT, computed as:
  (config_i_mpp_a / config_strings_total) * strings_on_that_mppt
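The scaling formula can be written directly as a small function. The name mppt_current is hypothetical and the numbers in the usage line are made up for illustration: a configuration of 12 parallel strings carrying 157.2 A in total gives 13.1 A per string, so an MPPT with 2 of those strings carries about 26.2 A.

```python
def mppt_current(config_i_mpp_a, config_strings_total, strings_on_that_mppt):
    """Scale a configuration-level MPP current down to one MPPT (hypothetical helper)."""
    if config_strings_total <= 0:
        raise ValueError("config_strings_total must be positive")
    # Per-string current times the number of strings landed on this MPPT.
    per_string = config_i_mpp_a / config_strings_total
    return per_string * strings_on_that_mppt


# Illustrative values only, not taken from a real report.
print(round(mppt_current(157.2, 12, 2), 1))
```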
Some PVsyst reports represent a site as one uniform configuration and do not include separate Array #n blocks.
V3 detects this and synthesizes one config_id = "1", then distributes strings across MPPTs using the inferred inverter model topology.
V3 also records these diagnostic fields inside array_configurations["1"]:
- inferred_inverters_reported
- inferred_inverters_required
- inferred_inverters_used
For reports with single configurations, V3 distributes strings round-robin across all available MPPT endpoints:
- Pattern: INV01-MPPT1, INV02-MPPT1, INV03-MPPT1, ..., INV01-MPPT2, INV02-MPPT2, etc.
- Benefits: Fair distribution, optimal capacity utilization, no false "over-limit" warnings
- Validation: Cross-checks against "Total Inverter Power" section for accuracy
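The round-robin pattern above can be sketched as follows. This is an illustration under assumptions, not the parser's implementation: the function name distribute_round_robin is hypothetical, every inverter is assumed to have the same number of MPPTs, and the per-MPPT string limit from the topology heuristics is not enforced here.

```python
def distribute_round_robin(total_strings, inverter_ids, mppts_per_inverter):
    """Spread strings one at a time across all MPPT endpoints (hypothetical helper)."""
    # Endpoint order follows the documented pattern:
    # INV01-MPPT1, INV02-MPPT1, ..., INV01-MPPT2, INV02-MPPT2, ...
    endpoints = [
        (inv_id, mppt)
        for mppt in range(1, mppts_per_inverter + 1)
        for inv_id in inverter_ids
    ]
    if not endpoints:
        raise ValueError("at least one inverter and one MPPT are required")

    counts = {endpoint: 0 for endpoint in endpoints}
    for i in range(total_strings):
        counts[endpoints[i % len(endpoints)]] += 1
    return counts
```

With 5 strings, two inverters and 2 MPPTs each, INV01-MPPT1 receives 2 strings and the other three endpoints receive 1 each, so no endpoint differs from another by more than one string.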
V3 includes built-in validation that compares parsed data against authoritative sections:
- Inverter Count Validation: Matches parsed count with "Total Inverter Power" section
- Warning Output: Clear messages for mismatches with debugging details
- Edge Case Handling: Helps identify regex failures in complex PVsyst formats
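A count check of this kind might look like the sketch below. The function name check_inverter_count, its arguments, and the warning wording are all hypothetical; the real parser's messages and debugging details may differ.

```python
def check_inverter_count(parsed_ids, reported_count):
    """Compare parsed inverters with the count stated in the report (hypothetical helper)."""
    warnings = []
    parsed_count = len(set(parsed_ids))
    if reported_count is None:
        # Assumption: a missing authoritative section is reported, not treated as an error.
        warnings.append("'Total Inverter Power' section not found; count check skipped")
    elif parsed_count != reported_count:
        warnings.append(
            f"Inverter count mismatch: parsed {parsed_count}, "
            f"report states {reported_count}"
        )
    return warnings
```

Returning a list of warnings instead of raising lets the parse finish and still surface the mismatch alongside the JSON output.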
- pvsyst_parser.py: V3 core parsing logic with validation
- app.py: V3 FastAPI web application
- index.html: V3 web interface
- requirements.txt: Dependencies
- README.md: This file
- pdfplumber: PDF text extraction (primary parsing engine)
- fastapi: Web API framework
- uvicorn: ASGI server
V3 Changes: Removed Camelot table extraction dependency for faster, more reliable text-only parsing.
- Text extraction fails: Ensure pdfplumber can read your PDF
- Web interface not loading: Verify uvicorn is running and port 8000 is accessible
- Inverter count mismatch warnings: Check PDF's "Total Inverter Power" section and array headers for consistency
- Tested with PVsyst V7.x and V8.x reports
- Works with standard PVsyst PDF exports
- V3 Enhancements: Better handling of single-array reports and validation warnings for edge cases
- May require adjustments for heavily customized reports (use validation warnings for debugging)
MIT License
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
For issues or questions:
- Open an issue on GitHub
- Check the troubleshooting section above
- Ensure your PVsyst PDF is a standard export format