# Real-time patient data collection and analysis using the Apache Kafka platform

Read my thesis here

## Abstract

Apache Kafka is a platform designed for processing large volumes of data and data streams. This thesis examined the architecture of Apache Kafka, the principles of event streaming platforms, and the benefits and challenges of real-time analytics in modern society. Additionally, the aim was to develop a prototype that enables real-time collection and analysis of patient data using the Apache Kafka platform.

The prototype was implemented by simulating patient data with the Python programming language, sending the data to Apache Kafka, and storing it in an InfluxDB database. Patient data was visualized with Grafana, enabling real-time analysis. The system’s performance and scalability were evaluated by simulating patient data at different frequencies.

The results demonstrated that Apache Kafka can efficiently process large amounts of data, and a similar system could be utilized in healthcare. Future development should consider, for example, GDPR-compliant security requirements to enable the adoption of the prototype in healthcare applications.

## Prototype

In this thesis, I developed a prototype that uses Apache Kafka for real-time patient data collection and analysis. The prototype operates as follows:

  1. `producer.py` simulates patient data every 5 seconds and sends it to Apache Kafka (see the producer sketch below).
  2. Apache Kafka receives the data and stores it in the `potilastieto-events` topic.
  3. `consumer.py` reads the data from the topic and writes it to the InfluxDB database as a `Point` object (see the consumer sketch below).
  4. InfluxDB is integrated with Grafana, allowing the data to be visualized on a real-time dashboard.

*Prototype flowchart*
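
The following is a minimal sketch of step 1, assuming the `kafka-python` library and a local broker at `localhost:9092`; the field names (`patient_id`, `heart_rate`, `spo2`) and the 5-second interval are illustrative placeholders and may differ from the actual `producer.py`.

```python
# producer_sketch.py -- minimal sketch of step 1 (hypothetical field names).
import json
import random
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    # Serialize each reading as a JSON-encoded byte string.
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    # One simulated set of vital signs for a single patient.
    reading = {
        "patient_id": "patient-1",
        "heart_rate": random.randint(55, 110),
        "spo2": random.randint(92, 100),
        "timestamp": time.time(),
    }
    producer.send("potilastieto-events", value=reading)
    producer.flush()
    time.sleep(5)  # new reading every 5 seconds, as in the prototype
```

A matching consumer sketch for step 3, under the same assumptions (here with the `influxdb-client` library, a placeholder token and a bucket named `patient-data`), reads events from the topic and writes each one to InfluxDB as a `Point`:

```python
# consumer_sketch.py -- minimal sketch of step 3 (placeholder credentials).
import json

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "potilastieto-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

influx = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = influx.write_api(write_options=SYNCHRONOUS)

for message in consumer:
    data = message.value
    # Map one Kafka event to one InfluxDB Point.
    point = (
        Point("patient_vitals")
        .tag("patient_id", data["patient_id"])
        .field("heart_rate", data["heart_rate"])
        .field("spo2", data["spo2"])
    )
    write_api.write(bucket="patient-data", record=point)
```

Because Kafka buffers events in the topic and decouples the producer from the consumer, the simulation interval in the producer can be shortened to test higher data rates without changing the consumer, which is the kind of frequency-based evaluation described in the abstract.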