# Real-time patient data collection and analysis using the Apache Kafka platform

Read my thesis here

## Abstract

Apache Kafka is a platform designed for processing large volumes of data and data streams. This thesis examined the architecture of Apache Kafka, the principles of event streaming platforms, and the benefits and challenges of real-time analytics in modern society. Additionally, the aim was to develop a prototype that enables real-time collection and analysis of patient data using the Apache Kafka platform.

The prototype was implemented by simulating patient data with the Python programming language, sending the data to Apache Kafka, and storing it in an InfluxDB database. Patient data was visualized with Grafana, enabling real-time analysis. The system’s performance and scalability were evaluated by simulating patient data at different frequencies.

The results demonstrated that Apache Kafka can efficiently process large amounts of data, and a similar system could be utilized in healthcare. Future development should consider, for example, GDPR-compliant security requirements to enable the adoption of the prototype in healthcare applications.

## Prototype

In this thesis, I developed a prototype that uses Apache Kafka for real-time patient data collection and analysis. The prototype operates as follows:

  1. `producer.py` simulates patient data every 5 seconds and sends it to Apache Kafka (see the producer sketch below).
  2. Apache Kafka receives the data and stores it in the `potilastieto-events` topic.
  3. `consumer.py` reads the data from the topic and writes it to the InfluxDB database as a `Point` object (see the consumer sketch below).
  4. InfluxDB is integrated with Grafana, allowing the data to be visualized on a real-time dashboard.

*Prototype flowchart*
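
The following is a minimal sketch of step 1, assuming the `kafka-python` library and a local broker at `localhost:9092`; the field names (`patient_id`, `heart_rate`, `spo2`) and the 5-second interval are illustrative placeholders and may differ from the actual `producer.py`.

```python
# producer_sketch.py -- minimal sketch of step 1 (hypothetical field names).
import json
import random
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    # Serialize each reading as a JSON-encoded byte string.
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    # One simulated set of vital signs for a single patient.
    reading = {
        "patient_id": "patient-1",
        "heart_rate": random.randint(55, 110),
        "spo2": random.randint(92, 100),
        "timestamp": time.time(),
    }
    producer.send("potilastieto-events", value=reading)
    producer.flush()
    time.sleep(5)  # new reading every 5 seconds, as in the prototype
```

A matching consumer sketch for step 3, under the same assumptions (here with the `influxdb-client` library, a placeholder token and a bucket named `patient-data`), reads events from the topic and writes each one to InfluxDB as a `Point`:

```python
# consumer_sketch.py -- minimal sketch of step 3 (placeholder credentials).
import json

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "potilastieto-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

influx = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = influx.write_api(write_options=SYNCHRONOUS)

for message in consumer:
    data = message.value
    # Map one Kafka event to one InfluxDB Point.
    point = (
        Point("patient_vitals")
        .tag("patient_id", data["patient_id"])
        .field("heart_rate", data["heart_rate"])
        .field("spo2", data["spo2"])
    )
    write_api.write(bucket="patient-data", record=point)
```

Because Kafka buffers events in the topic and decouples the producer from the consumer, the simulation interval in the producer can be shortened to test higher data rates without changing the consumer, which is the kind of frequency-based evaluation described in the abstract.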