@Candice0313

Summary

This PR adds a lightweight Python wrapper for running the Synthea CLI programmatically. Users can generate patients from within Python scripts by passing parameters such as state, age range, gender, number of patients, and a custom module. The wrapper also parses the FHIR output and returns only the Bundles that contain Patient entries.

Related Issue

#951

Motivation

Currently, the only way to run Synthea is by invoking run_synthea from the command line. This wrapper provides a Pythonic interface for running Synthea in notebooks, scripts, and larger data processing workflows, which makes it easier to integrate with Python-based pipelines.

Implementation Details

  • Introduced a SyntheaGenerator class inside scripts/python_wrapper/synthea_wrapper.py
  • Detects platform and selects run_synthea (Unix) or run_synthea.bat (Windows)
  • Accepts and passes standard CLI arguments including -p, -a, -g, and -m
  • Parses FHIR Bundle JSON and returns only those containing Patient entries
  • Automatically clears output/fhir/ before each run
  • Output directory is configurable via a save() method
  • Located inside scripts/python_wrapper/ to avoid interfering with Java build
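
For reviewers who want a quick picture of the platform detection and argument handling listed above, here is a minimal sketch. It is not the code in synthea_wrapper.py: the function name `build_synthea_command` and its parameters are hypothetical, and the `system` override exists only so the logic can be exercised on any platform. The flags (-p, -a, -g, -m) and the positional state argument follow the Synthea CLI usage.

```python
import platform
from typing import List, Optional


def build_synthea_command(
    state: Optional[str] = None,
    patients: int = 1,
    age_range: Optional[str] = None,
    gender: Optional[str] = None,
    module: Optional[str] = None,
    system: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for one Synthea run (hypothetical helper)."""
    # Fall back to the real platform when no override is given.
    system = system or platform.system()

    # run_synthea.bat on Windows, run_synthea on Unix-like systems.
    script = "run_synthea.bat" if system == "Windows" else "./run_synthea"

    cmd = [script, "-p", str(patients)]
    if age_range:
        cmd += ["-a", age_range]  # e.g. "30-40"
    if gender:
        cmd += ["-g", gender]  # "M" or "F"
    if module:
        cmd += ["-m", module]
    if state:
        cmd.append(state)  # the state is a positional argument
    return cmd
```

The resulting list could be handed to `subprocess.run(cmd, cwd=<Synthea checkout>)`. Building a list rather than a shell string avoids quoting problems with state names that contain spaces.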

Testing

Manual testing was done via example.py. Verified that:

  • 2 patients were successfully generated
  • FHIR JSON output was parsed and returned as a list of Bundles
  • Data was saved to a custom output directory using save()
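
The second check depends on the Bundle filtering described under Implementation Details. Synthea's FHIR export can also write hospital and practitioner information files next to the patient records, and those Bundles contain no Patient entry. A rough sketch of how such filtering might look is below; `load_patient_bundles` is a hypothetical name and the wrapper's actual implementation may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_patient_bundles(fhir_dir: Path) -> List[Dict[str, Any]]:
    """Return the FHIR Bundles in fhir_dir that contain a Patient resource."""
    bundles: List[Dict[str, Any]] = []
    for path in sorted(Path(fhir_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)

        # Skip anything that is not a FHIR Bundle object.
        if not isinstance(data, dict) or data.get("resourceType") != "Bundle":
            continue

        # Keep the Bundle only if at least one entry is a Patient.
        entries = data.get("entry") or []
        has_patient = any(
            isinstance(entry, dict)
            and (entry.get("resource") or {}).get("resourceType") == "Patient"
            for entry in entries
        )
        if has_patient:
            bundles.append(data)
    return bundles
```

Calling `load_patient_bundles(Path("output/fhir"))` after a run would then return one Bundle per generated patient and skip the hospital and practitioner files.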

Next Steps (Optional)

  • Add unit tests under scripts/python_wrapper/tests
  • Consider turning the wrapper into a pip-installable package
  • Add user documentation for the wrapper in the main README

Please feel free to suggest structural or naming changes. Happy to revise based on feedback!

@awatson1978
Collaborator

Was able to verify by running:

python3 ./scripts/python_wrapper/example.py

Output was the following:

Running with options:
Population: 2
Seed: 1749694248447
Provider Seed:1749694248447
Reference Time: 1749694248447
Location: California
Min Age: 30
Max Age: 40
Gender: F
Modules: 
       > [0 loaded]
2 -- Caitlin552 Howell947 (33 y/o F) Diamond Bar, California  (35736)
1 -- Bok974 Phylis163 Fritsch593 (38 y/o F) San Francisco, California  (40747)
Records: total=2, alive=2, dead=0
RNG=2
Clinician RNG=34482

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.2.1/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 10s
4 actionable tasks: 4 executed
Generated 2 patients
Sample patient resourceType: Bundle
Saved files: ['Bok974_Phylis163_Fritsch593_edd905dc-3154-1b8f-f647-da395c750fa9.json', 'Caitlin552_Howell947_14868376-0689-1f04-04a2-b57ecd1711c0.json', 'hospitalInformation1749694248447.json', 'practitionerInformation1749694248447.json']

Why the change to 2 patients by default?

@awatson1978 mentioned this pull request Jun 12, 2025
@awatson1978
Collaborator

Not sure that the __pycache__ directory should be in there.

@awatson1978
Collaborator

awatson1978 commented Jun 12, 2025

#1583 passed checks. I removed the __pycache__ directory, and added a readme.

When #1583 is approved, this PR should automatically close, as @Candice0313's commits are included in that PR.
