There are a few common and mostly well-known challenges when architecting for data. For example, many data teams struggle to move data in a stable and reliable way from operational systems to analytics systems. At the same time, they must manage complex and often costly infrastructure landscapes. These issues hinder companies from effectively leveraging their data for business purposes.
Drawing from real-world experience, this presentation will explore how we address these challenges by building reliable and scalable data platforms at reasonable cost. It will also cover solutions that help operational teams provide their data and observe its flow into analytics systems. In addition to discussing architectures and design considerations, the presentation will highlight tools and techniques used to implement these platforms.
Interview:
What is the focus of your work?
In one sentence, I would say: making data available for analytical use, regardless of whether it feeds a dashboard, machine learning, or GenAI. In addition to data platforms and infrastructure, this often includes the transfer of data from operational systems to analytical systems, which is still often neglected and offers many pitfalls.
What’s the motivation for your talk?
Many data initiatives fail to obtain reliable and consistent data. There are many different approaches to solving this problem, some of which are borrowed from well-known software engineering best practices. I would like to present a few of these solutions and, above all, show how we have implemented them in practice, without the costs exploding.
Who is your talk for?
Of course, the talk is primarily aimed at data architects and data engineers. But I also invite all software engineers to take a look at the topic. They too are increasingly coming into contact with the supply of data and can make a big difference. Plus, knowing what's possible there could make their lives easier.
What do you want someone to walk away with from your presentation?
Ideas on how to provide data in a better way, meaning more stable and more correct, along with tangible options for which tools and processes can be used to implement this, without sinking into complexity and, ultimately, costs.
What do you think is the next big disruption in software?
I believe that one of the big issues will be the reduction of complexity. Modern IT landscapes are a patchwork of barely comprehensible building blocks that are somehow held together. This complicates not only maintenance but also testing and implementing new features, which makes it hard to adapt quickly to the market. In this context, automation (e.g. through AI) will play a major role on both the business and the technical side.
What was one interesting thing that you learned from a previous QCon?
I was already familiar with DuckDB, but it was Hannes Mühleisen's presentation in 2023 that made me understand what possibilities the technology offers and what diverse applications it enables. That was the first time I could imagine the changes in data architectures made possible by the new, lean processing frameworks.
Speaker

Matthias Niehoff
Head of Data and Data Architecture @codecentric AG, iSAQB Certified Professional for Software Architecture
Matthias Niehoff works as Head of Data and Data Architect at codecentric AG and supports customers in the design and implementation of data architectures. His focus is on the infrastructure and organization needed to help data and ML projects succeed.