Challenges in data engineering: Secrets on building a real-time streaming platform from Celonis’s head of streaming data and CEO of Lenses.io

https://delivery-p141552-e1488202.adobeaemcloud.com/adobe/assets/urn:aaid:aem:079db562-b93b-4f53-b022-063901ce89c8/as/Realtime_Connectors_Blog_Banner.png

Building a system that lets companies map, visualize and optimize their business processes in real time is a tough technical challenge.

Celonis engineers are embracing this challenge and creating a whole new “operating system” for the enterprise, said Antonios Chalkiopoulos, VP of Streaming Data, Engineering, Action and Ecosystem at Celonis. And to do it, they’ve built a data streaming platform that can ingest data on millions of “process instances” as they flow through dozens of systems (IT, ERP, CRM, desktops, etc.) in real time.

https://delivery-p141552-e1488202.adobeaemcloud.com/adobe/assets/urn:aaid:aem:cff976b9-482f-45a9-8e60-1d01a927da7d/as/antonios_chalkiopoulos.jpeg

Antonios Chalkiopoulos, VP of Streaming Data, Engineering, Action and Ecosystem at Celonis, is a former developer and data engineer with degrees from Oxford and the University of Hull. He is also co-founder and CEO of Lenses.io, which Celonis acquired in October 2021. Chalkiopoulos and the other Lenses.io co-founders, Stefan Bocutiu, Andrew Stevenson and Christina Daskalaki, created Lenses to help engineers and developers access, move and transform data in Apache Kafka. The four founders are also some of the biggest contributors to the Kafka and Kubernetes open-source ecosystems.

I caught up with Chalkiopoulos right before Celonis World Tour 2022 to talk about how real-time data ingestion works within the Celonis EMS (execution management system) and why the Celonis engineering team decided to move from batch processing to streaming data. We also discussed the technical hurdles the team had to overcome, and he offered advice for engineers and developers working on similar technical challenges.

Why was the decision made to move from batch data ingestion to real-time streaming?

Chalkiopoulos: Celonis is the operating system of the enterprise. What this means is that the execution management system watches every instance of every process as it is created and makes its way through the enterprise, and steers it towards the right outcomes, whether that requires human judgment or automated adjustments. Timely action requires access to the process trajectory in real time. Streaming is the natural way to achieve this, because not accumulating data into batches improves both technical performance and the user experience; hence our emphasis on streaming at the core of the EMS infrastructure.

What were some of the technical hurdles you experienced and how did you overcome them?

Chalkiopoulos: A typical medium-sized enterprise has tens of thousands of processes, and thousands of instances of these processes are created every day. That’s over a million process instances a day. Each process instance navigates hundreds of steps towards realizing outcomes. Managing the sheer data volumes, handling the heterogeneity of the enterprise in the form of the many source systems in which these processes are encoded, and dealing with resilience and recovery from failures are at the heart of the technical challenges we have had to overcome.
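
To get a feel for what those figures imply for a streaming platform, here is a rough back-of-envelope sketch. It is not Celonis's own sizing: it assumes the lower bound of each range Chalkiopoulos mentions (10,000 processes, 1,000 new instances per process per day, 100 steps per instance), reads "thousands of instances" as per process, and assumes each step produces exactly one event. The function name `daily_volume` is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume(processes, instances_per_process_per_day, steps_per_instance):
    """Estimate daily process instances, step events, and average events per second.

    Assumption (not from the interview): every step emits exactly one event
    and load is spread evenly across the day.
    """
    instances_per_day = processes * instances_per_process_per_day
    events_per_day = instances_per_day * steps_per_instance
    events_per_second = events_per_day / SECONDS_PER_DAY
    return instances_per_day, events_per_day, events_per_second


# Lower-bound figures assumed from "tens of thousands", "thousands", "hundreds".
instances, events, rate = daily_volume(
    processes=10_000,
    instances_per_process_per_day=1_000,
    steps_per_instance=100,
)

print(f"process instances per day: {instances:,}")
print(f"step events per day:       {events:,}")
print(f"average events per second: {rate:,.0f}")
```

Under these assumptions the result is about 10 million process instances and a billion step events a day, which averages out to more than 11,000 events every second. Real traffic is bursty rather than evenly spread, so peak rates would be higher.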

What advice would you give to other engineering and dev teams on how to build streaming systems that scale?

Chalkiopoulos: Reliability and scale are at the core of the streams that enable your data flows. Making these systems reliable, instantly connected, elastic and robust requires extreme attention to detail, and it is very rewarding to see them in action.