|
10 | 10 |
|
11 | 11 | </div>
|
12 | 12 |
|
13 |
| -Simplify testing and evaluating AI and NLP applications in a healthcare context 💫 🏥. |
| 13 | +Simplify developing, testing and validating AI and NLP applications in a healthcare context 💫 🏥. |
14 | 14 |
|
15 |
| -Building applications that integrate in healthcare systems is complex, and so is designing reliable, reactive algorithms involving unstructured data. Let's try to change that. |
| 15 | +Building applications that integrate with electronic health record systems (EHRs) is complex, and so is designing reliable, reactive algorithms involving unstructured data. Let's try to change that. |
16 | 16 |
|
17 | 17 | ```bash
|
18 | 18 | pip install healthchain
|
19 | 19 | ```
|
20 |
| -First time here? Check out our [Docs](dotimplement.github.io/HealthChain/) page! |
| 20 | +First time here? Check out our [Docs](https://dotimplement.github.io/HealthChain/) page! |
21 | 21 |
|
22 | 22 | ## Features
|
23 |
| -- [x] 🍱 Create sandbox servers and clients that comply with real EHRs API and data standards. |
24 |
| -- [x] 🗃️ Generate synthetic FHIR resources or load your own data as free-text. |
25 |
| -- [x] 💾 Save generated request and response data for each sandbox run. |
26 |
| -- [x] 🎈 Streamlit dashboard to inspect generated data and responses. |
27 |
| -- [x] 🧪 Experiment with LLMs in an end-to-end HL7-compliant pipeline from day 1. |
| 23 | +- [x] 🛠️ Build custom pipelines or use [pre-built ones](https://dotimplement.github.io/HealthChain/reference/pipeline/pipeline/#prebuilt) for your healthcare NLP and ML tasks |
| 24 | +- [x] 🏗️ Add built-in CDA and FHIR parsers to connect your pipeline to interoperability standards |
| 25 | +- [x] 🧪 Test your pipelines in full, healthcare-context-aware [sandbox](https://dotimplement.github.io/HealthChain/reference/sandbox/sandbox/) environments |
| 26 | +- [x] 🗃️ Generate [synthetic healthcare data](https://dotimplement.github.io/HealthChain/reference/utilities/data_generator/) for testing and development |
| 27 | +- [x] 🚀 Deploy sandbox servers locally with [FastAPI](https://fastapi.tiangolo.com/) |
28 | 28 |
|
29 | 29 | ## Why use HealthChain?
|
30 |
| -- **Scaling EHR integrations is a manual and time-consuming process** - HealthChain abstracts away complexities so you can focus on AI development, not EHR configurations. |
31 |
| -- **Evaluating the behaviour of AI in complex systems is a difficult and labor-intensive task** - HealthChain provides a framework to test the real-world resilience of your whole system, not just your models. |
32 |
| -- **[Most healthcare data is unstructured](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/)** - HealthChain is optimised for real-time AI/NLP applications that deal with realistic healthcare data. |
| 30 | +- **EHR integrations are manual and time-consuming** - HealthChain abstracts away complexities so you can focus on AI development, not EHR configurations. |
| 31 | +- **It's difficult to track and evaluate multiple integration instances** - HealthChain provides a framework to test the real-world resilience of your whole system, not just your models. |
| 32 | +- [**Most healthcare data is unstructured**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/) - HealthChain is optimized for real-time AI and NLP applications that deal with realistic healthcare data. |
33 | 33 | - **Built by health tech developers, for health tech developers** - HealthChain is tech stack agnostic, modular, and easily extensible.
|
34 | 34 |
|
35 |
| -## Clinical Decision Support (CDS) |
| 35 | +## Pipeline |
| 36 | +Pipelines provide a flexible way to build and manage processing steps for NLP and ML tasks, and they can easily interface with parsers and connectors to integrate with EHRs. |
| 37 | + |
| 38 | +### Building a pipeline |
| 39 | + |
| 40 | +```python |
| 41 | +from healthchain.io.containers import Document |
| 42 | +from healthchain.pipeline import Pipeline |
| 43 | +from healthchain.pipeline.components import TextPreProcessor, Model, TextPostProcessor |
| 44 | + |
| 45 | +# Initialize the pipeline |
| 46 | +nlp_pipeline = Pipeline[Document]() |
| 47 | + |
| 48 | +# Add TextPreProcessor component |
| 49 | +preprocessor = TextPreProcessor(tokenizer="spacy") |
| 50 | +nlp_pipeline.add(preprocessor) |
| 51 | + |
| 52 | +# Add Model component (assuming we have a pre-trained model) |
| 53 | +model = Model(model_path="path/to/pretrained/model") |
| 54 | +nlp_pipeline.add(model) |
| 55 | + |
| 56 | +# Add TextPostProcessor component |
| 57 | +postprocessor = TextPostProcessor( |
| 58 | +    postcoordination_lookup={ |
| 59 | +        "heart attack": "myocardial infarction", |
| 60 | +        "high blood pressure": "hypertension" |
| 61 | +    } |
| 62 | +) |
| 63 | +nlp_pipeline.add(postprocessor) |
| 64 | + |
| 65 | +# Build the pipeline |
| 66 | +nlp = nlp_pipeline.build() |
| 67 | + |
| 68 | +# Use the pipeline |
| 69 | +result = nlp(Document("Patient has a history of heart attack and high blood pressure.")) |
| 70 | + |
| 71 | +print(f"Entities: {result.entities}") |
| 72 | +``` |
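The add-then-build pattern above can be illustrated with a minimal, library-free sketch. Everything in it (`ToyDocument`, `ToyPipeline`, `make_term_matcher`, `make_lookup_postprocessor`) is a hypothetical stand-in written for illustration; it is not HealthChain's implementation, and the real components may behave differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document container (not healthchain's Document)."""
    text: str
    entities: List[str] = field(default_factory=list)


Component = Callable[[ToyDocument], ToyDocument]


class ToyPipeline:
    """Collects components with add() and composes them with build()."""

    def __init__(self) -> None:
        self._components: List[Component] = []

    def add(self, component: Component) -> None:
        self._components.append(component)

    def build(self) -> Component:
        # Snapshot the list so later add() calls do not change a built pipeline.
        components = list(self._components)

        def run(doc: ToyDocument) -> ToyDocument:
            for component in components:
                doc = component(doc)
            return doc

        return run


def make_term_matcher(terms: List[str]) -> Component:
    """Toy 'model': tags every known term found in the lower-cased text."""
    def match(doc: ToyDocument) -> ToyDocument:
        lowered = doc.text.lower()
        doc.entities = [term for term in terms if term in lowered]
        return doc
    return match


def make_lookup_postprocessor(lookup: Dict[str, str]) -> Component:
    """Toy post-processor: swaps each entity for its preferred term, if one exists."""
    def refine(doc: ToyDocument) -> ToyDocument:
        doc.entities = [lookup.get(entity, entity) for entity in doc.entities]
        return doc
    return refine


toy = ToyPipeline()
toy.add(make_term_matcher(["heart attack", "high blood pressure"]))
toy.add(make_lookup_postprocessor({
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}))
run = toy.build()

result = run(ToyDocument("Patient has a history of heart attack and high blood pressure."))
print(result.entities)  # ['myocardial infarction', 'hypertension']
```

Keeping each component as a plain callable on the document means steps can be reordered or swapped without touching the others, which is the property the pipeline abstraction is meant to provide.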
| 73 | +### Using pre-built pipelines |
| 74 | + |
| 75 | +```python |
| 76 | +from healthchain.io.containers import Document |
| 77 | +from healthchain.pipeline import MedicalCodingPipeline |
| 78 | + |
| 79 | +# Load the pre-built MedicalCodingPipeline |
| 80 | +pipeline = MedicalCodingPipeline.load("./path/to/model") |
| 81 | + |
| 82 | +# Create a document to process |
| 83 | +result = pipeline(Document("Patient has a history of myocardial infarction and hypertension.")) |
| 84 | + |
| 85 | +print(f"Entities: {result.entities}") |
| 86 | +``` |
| 87 | + |
| 88 | +## Sandbox |
| 89 | + |
| 90 | +Sandboxes provide a staging environment for testing and validating your pipeline in a realistic healthcare context. |
| 91 | + |
| 92 | +### Clinical Decision Support (CDS) |
36 | 93 | [CDS Hooks](https://cds-hooks.org/) is an [HL7](https://cds-hooks.hl7.org) published specification for clinical decision support.
|
37 | 94 |
|
38 | 95 | **When is this used?** CDS hooks are triggered at certain events during a clinician's workflow in an electronic health record (EHR), e.g. when a patient record is opened or when an order is selected.
|
39 | 96 |
|
40 |
| -**What information is sent**: the context of the event and FHIR resources that are requested by your service, for example, the patient ID and information on the encounter and conditions they are being seen for. |
| 97 | +**What information is sent**: the context of the event and [FHIR](https://hl7.org/fhir/) resources that are requested by your service, for example, the patient ID and information on the encounter and conditions they are being seen for. |
41 | 98 |
|
42 | 99 | **What information is returned**: “cards” displaying text, actionable suggestions, or links to launch a [SMART](https://smarthealthit.org/) app from within the workflow.
|
43 | 100 |
|
44 |
| -**What you need to decide**: What data do I want my EHR client to send, and how will my service process this data. |
45 |
| - |
46 | 101 |
|
47 | 102 | ```python
|
48 | 103 | import healthchain as hc
|
49 | 104 |
|
| 105 | +from healthchain.pipeline import Pipeline |
50 | 106 | from healthchain.use_cases import ClinicalDecisionSupport
|
51 | 107 | from healthchain.models import Card, CdsFhirData, CDSRequest
|
52 |
| -from healthchain.data_generator import DataGenerator |
53 |
| - |
| 108 | +from healthchain.data_generator import CdsDataGenerator |
54 | 109 | from typing import List
|
55 | 110 |
|
56 |
| -# Decorate class with sandbox and pass in use case |
57 | 111 | @hc.sandbox
|
58 |
| -class myCDS(ClinicalDecisionSupport): |
| 112 | +class MyCDS(ClinicalDecisionSupport): |
59 | 113 |     def __init__(self) -> None:
|
60 |
| -        self.data_generator = DataGenerator() |
| 114 | +        self.pipeline = Pipeline.load("./path/to/model") |
| 115 | +        self.data_generator = CdsDataGenerator() |
61 | 116 |
|
62 | 117 |     # Sets up an instance of a mock EHR client of the specified workflow
|
63 | 118 |     @hc.ehr(workflow="patient-view")
|
64 | 119 |     def ehr_database_client(self) -> CdsFhirData:
|
65 |
| -        self.data_generator.generate() |
66 |
| -        return self.data_generator.data |
| 120 | +        return self.data_generator.generate() |
67 | 121 |
|
68 | 122 |     # Define your application logic here
|
69 | 123 |     @hc.api
|
70 |
| -    def my_service(self, request: CdsRequest) -> List[Card]: |
71 |
| -        result = "Hello " + request["patient_name"] |
72 |
| -        return result |
73 |
| - |
74 |
| -if __name__ == "__main__": |
75 |
| -    cds = myCDS() |
76 |
| -    cds.start_sandbox() |
77 |
| -``` |
78 |
| - |
79 |
| -Then run: |
80 |
| -```bash |
81 |
| -healthchain run mycds.py |
| 124 | +    def my_service(self, data: CDSRequest) -> List[Card]: |
| 125 | +        result = self.pipeline(data) |
| 126 | +        return [ |
| 127 | +            Card( |
| 128 | +                summary="Welcome to our Clinical Decision Support service.", |
| 129 | +                detail=result.summary, |
| 130 | +                indicator="info" |
| 131 | +            ) |
| 132 | +        ] |
82 | 133 | ```
|
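To make the request and response shapes described above concrete, here is a minimal sketch that uses plain dictionaries with field names from the CDS Hooks specification (`hook`, `hookInstance`, `context`, `prefetch`, and a card's `summary`, `indicator`, `source`). The IDs, the label, and the `build_response` helper are made up for illustration; this does not use HealthChain's `CDSRequest` or `Card` models.

```python
import json
import uuid

# Request an EHR might send when the "patient-view" hook fires.
# All identifiers below are placeholder values.
request = {
    "hook": "patient-view",
    "hookInstance": str(uuid.uuid4()),
    "context": {
        "userId": "Practitioner/example",
        "patientId": "example-patient-id",
    },
    # FHIR resources the service asked the EHR to send along.
    "prefetch": {
        "patient": {"resourceType": "Patient", "id": "example-patient-id"},
    },
}


def build_response(req: dict) -> dict:
    """Hypothetical service logic: return a single informational card."""
    patient_id = req["context"]["patientId"]
    card = {
        "summary": f"Record opened for patient {patient_id}",
        "indicator": "info",  # one of: info, warning, critical
        "source": {"label": "Example CDS service"},
    }
    return {"cards": [card]}


response = build_response(request)
print(json.dumps(response, indent=2))
```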
83 |
| -This will populate your EHR client with the data generation method you have defined, send requests to your server for processing, and save the data in `./output` by default. |
84 | 134 |
|
85 |
| -## Clinical Documentation |
| 135 | +### Clinical Documentation |
86 | 136 |
|
87 |
| -The ClinicalDocumentation use case implements a real-time Clinical Documentation Improvement (CDI) service. It helps convert free-text medical documentation into coded information that can be used for billing, quality reporting, and clinical decision support. |
| 137 | +The `ClinicalDocumentation` use case implements a real-time Clinical Documentation Improvement (CDI) service. It helps convert free-text medical documentation into coded information that can be used for billing, quality reporting, and clinical decision support. |
88 | 138 |
|
89 | 139 | **When is this used?** Triggered when a clinician opts in to a CDI functionality (e.g. Epic NoteReader) and signs or pends a note after writing it.
|
90 | 140 |
|
91 |
| -**What information is sent**: A [CDA (Clinical Document Architecture)](https://www.hl7.org/implement/standards/product_brief.cfm?product_id=7) document which contains continuity of care data and free-text data, e.g. a patient's problem list and the progress note that the clinician has entered in the EHR. |
92 |
| - |
93 |
| -**What information is returned**: A CDA document which contains additional structured data extracted and returned by your CDI service. |
| 141 | +**What information is sent**: A [CDA (Clinical Document Architecture)](https://www.hl7.org.uk/standards/hl7-standards/cda-clinical-document-architecture/) document which contains continuity of care data and free-text data, e.g. a patient's problem list and the progress note that the clinician has entered in the EHR. |
94 | 142 |
|
95 | 143 | ```python
|
96 | 144 | import healthchain as hc
|
97 | 145 |
|
| 146 | +from healthchain.pipeline import MedicalCodingPipeline |
98 | 147 | from healthchain.use_cases import ClinicalDocumentation
|
99 | 148 | from healthchain.models import CcdData, ProblemConcept, Quantity
|
100 | 149 |
|
101 | 150 | @hc.sandbox
|
102 | 151 | class NotereaderSandbox(ClinicalDocumentation):
|
103 | 152 |     def __init__(self):
|
104 |
| -        self.cda_path = "./resources/uclh_cda.xml" |
| 153 | +        self.pipeline = MedicalCodingPipeline.load("./path/to/model") |
105 | 154 |
|
106 | 155 |     # Load an existing CDA file
|
107 | 156 |     @hc.ehr(workflow="sign-note-inpatient")
|
108 | 157 |     def load_data_in_client(self) -> CcdData:
|
109 |
| -        with open(self.cda_path, "r") as file: |
| 158 | +        with open("/path/to/cda/data.xml", "r") as file: |
110 | 159 |             xml_string = file.read()
|
111 | 160 |
|
112 | 161 |         return CcdData(cda_xml=xml_string)
|
113 | 162 |
|
114 |
| -    # Define application logic |
115 | 163 |     @hc.api
|
116 | 164 |     def my_service(self, ccd_data: CcdData) -> CcdData:
|
117 |
| -        # Apply method from ccd_data.note and access existing entries from ccd.problems |
118 |
| - |
119 |
| -        new_problem = ProblemConcept( |
120 |
| -            code="38341003", |
121 |
| -            code_system="2.16.840.1.113883.6.96", |
122 |
| -            code_system_name="SNOMED CT", |
123 |
| -            display_name="Hypertension", |
124 |
| -        ) |
125 |
| -        ccd_data.problems.append(new_problem) |
126 |
| -        return ccd_data |
| 165 | +        annotated_ccd = self.pipeline(ccd_data) |
| 166 | +        return annotated_ccd |
127 | 167 | ```
|
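For a sense of what the free-text part of a CDA document looks like, the sketch below reads section titles and narrative text with Python's standard `xml.etree.ElementTree`. The XML snippet is a heavily trimmed, made-up example (a real CDA document carries a required header and many more elements), and `extract_sections` is a hypothetical helper, not part of HealthChain's CDA parser.

```python
import xml.etree.ElementTree as ET

# CDA elements live in the HL7 v3 namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Trimmed, illustrative CDA body; not a valid complete document.
cda_xml = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component>
    <structuredBody>
      <component>
        <section>
          <title>Progress Note</title>
          <text>Patient reports chest pain on exertion.</text>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>"""


def extract_sections(xml_string: str) -> dict:
    """Map each section title to its narrative text."""
    root = ET.fromstring(xml_string)
    sections = {}
    for section in root.iterfind(".//hl7:section", NS):
        title = section.findtext("hl7:title", default="", namespaces=NS)
        text_el = section.find("hl7:text", NS)
        # itertext() also collects text nested inside inline markup.
        text = "".join(text_el.itertext()).strip() if text_el is not None else ""
        sections[title] = text
    return sections


sections = extract_sections(cda_xml)
print(sections)  # {'Progress Note': 'Patient reports chest pain on exertion.'}
```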
| 168 | +### Running a sandbox |
128 | 169 |
|
| 170 | +Make sure your `mycds.py` file includes the following lines: |
129 | 171 |
|
130 |
| -### Streamlit dashboard |
131 |
| -Note this is currently not meant to be a frontend to the EHR client, so you will have to run it separately from the sandbox application. |
| 172 | +```python |
| 173 | +cds = MyCDS() |
| 174 | +cds.run_sandbox() |
| 175 | +``` |
| 176 | +When the sandbox runs, it populates your EHR client using the data generation method you have defined, sends requests to your server for processing, and saves the data in the `./output` directory. |
132 | 177 |
|
| 178 | +Then run: |
133 | 179 | ```bash
|
134 |
| -pip install streamlit |
135 |
| -streamlit streamlit-demo/app.py |
| 180 | +healthchain run mycds.py |
136 | 181 | ```
|
137 |
| - |
| 182 | +By default, the server runs at `http://127.0.0.1:8000`, and you can interact with the exposed endpoints at `/docs`. |
138 | 183 | ## Road Map
|
139 |
| -- [x] 📝 Adding Clinical Documentation use case |
140 |
| -- [ ] 🎛️ Version and test different EHR backend configurations |
141 |
| -- [ ] 🤖 Integrations with popular LLM and NLP libraries |
142 |
| -- [ ] ❓ Evaluation framework for pipelines and use cases |
| 184 | +- [ ] 🎛️ Versioning and artifact management for pipelines and sandbox EHR configurations |
| 185 | +- [ ] 🤖 Integrations with other pipeline libraries such as spaCy, HuggingFace, LangChain etc. |
| 186 | +- [ ] ❓ Testing and evaluation framework for pipelines and use cases |
| 187 | +- [ ] 🧠 Multi-modal pipelines that have built-in NLP to utilize unstructured data |
143 | 188 | - [ ] ✨ Improvements to synthetic data generator methods
|
144 |
| -- [ ] 👾 Frontend demo for EHR client |
| 189 | +- [ ] 👾 Frontend UI for EHR client and visualization features |
145 | 190 | - [ ] 🚀 Production deployment options
|
146 | 191 |
|
147 | 192 | ## Contribute
|
|