In the past I learned data can be processed in parallel or with transactional guarantees. Kafka showed me that there often is a pragmatic middle way. The idea is, does it really matter if patient A or patient B’s data is processed first? In many cases no, as long as all data belonging to a single patient is processed in the correct order.
Or in Kafka terms, the data is partitioned by the patient ID and thus all data within a partition is processed in order and transactional, parallel to other patients’ data.
That got me hooked on Apache Kafka. Its other concepts in regards to Load Balancing, KStreams, etc. are clever as well.