Abstract
Over the last decade, streaming architectures have largely been built around topic-centric primitives—logs, streams, and event pipelines—then stitched together with databases, caches, OLAP engines, and (increasingly) new serving systems. This approach scales, but it also accumulates architectural debt: duplicated data, fractured "truth," inconsistent guarantees, and rising operational overhead as batch, streaming, and product analytics diverge.
In this talk we introduce an emerging shift we call the Streamhouse: a table-centric streaming architecture that treats tables as the primary primitive, and models "real-time" as freshness tiers rather than separate systems. Conceptually, it extends the lakehouse by making continuous ingestion + continuous maintenance the default, so one copy of data can serve both low-latency and historical workloads with straightforward SQL access.
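To make "continuous maintenance as the default" concrete, here is a minimal sketch using Flink's Table API (Flink appears in both speakers' backgrounds, but this is not the talk's implementation): a derived table is kept fresh by one long-running INSERT instead of scheduled batch refreshes. Table names, connector choices, and the hourly window are illustrative assumptions.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ContinuousMaintenanceSketch {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Source: an append-only event stream (datagen stands in for a real connector).
        tEnv.executeSql(
            "CREATE TABLE orders_raw (" +
            "  order_id STRING, amount DECIMAL(10, 2), ts TIMESTAMP(3)," +
            "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
            ") WITH ('connector' = 'datagen')");

        // Target: the continuously maintained table that dashboards and ad-hoc SQL
        // would both read; in practice an open table format, here just a print sink.
        tEnv.executeSql(
            "CREATE TABLE orders_hourly (" +
            "  window_start TIMESTAMP(3), total DECIMAL(12, 2)" +
            ") WITH ('connector' = 'print')");

        // Continuous maintenance: one long-running INSERT keeps the table fresh,
        // replacing scheduled batch refresh jobs.
        tEnv.executeSql(
            "INSERT INTO orders_hourly " +
            "SELECT window_start, CAST(SUM(amount) AS DECIMAL(12, 2)) " +
            "FROM TABLE(TUMBLE(TABLE orders_raw, DESCRIPTOR(ts), INTERVAL '1' HOUR)) " +
            "GROUP BY window_start, window_end").await();
    }
}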
From there we move from idea to practice: we'll walk through how platforms evolve from batch refreshes to "near-real-time" to hot/warm/cold SLAs. We'll show a pattern we built first for analytics: an OLAP serving layer with federated SQL over tiered data. That solves the initial problem, until you notice the same shape repeating across other workloads that also want "fresh + queryable" canonical data, not just dashboards. The talk traces how this pushes teams to generalize from "OLAP over the lake" to "tiered access as a platform primitive," and what trade-offs you must get right: where state lives, how tier boundaries are defined, how continuous maintenance is paid for, and what consistency guarantees you can realistically promise.
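One of those trade-offs, where the tier boundary sits and how a query is routed across it, can be sketched without naming any particular engine. The small Java routine below is an illustrative assumption, not the speakers' design: the table names and the 48-hour hot window are invented, and the point is only that a query planner must compose hot and cold reads so the two copies stay disjoint at the boundary.

import java.time.Duration;
import java.time.Instant;

public class TierRouterSketch {

    // Rows newer than this age are assumed to live in the hot serving store;
    // older rows live in the lake. The same rows must never be read from both.
    static final Duration HOT_WINDOW = Duration.ofHours(48);

    /** Builds a federated query over hot and cold copies of the same logical table. */
    static String routeQuery(Instant from, Instant to, Instant now) {
        Instant boundary = now.minus(HOT_WINDOW);
        String hot  = "SELECT * FROM hot.orders  WHERE ts >= '" + boundary + "'";
        String cold = "SELECT * FROM lake.orders WHERE ts <  '" + boundary + "'";

        if (!from.isBefore(boundary)) {
            // Entire range is fresher than the boundary: hot tier only.
            return hot + " AND ts BETWEEN '" + from + "' AND '" + to + "'";
        }
        if (!to.isAfter(boundary)) {
            // Entire range is older than the boundary: cold tier only.
            return cold + " AND ts BETWEEN '" + from + "' AND '" + to + "'";
        }
        // Range straddles the boundary: stitch the tiers with UNION ALL,
        // relying on the boundary predicates to keep the copies disjoint.
        return hot + " AND ts <= '" + to + "' UNION ALL "
             + cold + " AND ts >= '" + from + "'";
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(routeQuery(now.minus(Duration.ofDays(7)), now, now));
    }
}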
The session is practical but forward-looking: you'll leave with a working mental model for tiered, real-time lakehouse systems, plus reference architecture patterns and a decision framework for applying the Streamhouse idea as a repeatable architecture rather than a custom stack.
Speaker
Giannis Polyzos
Principal Streaming Architect @Ververica
Giannis Polyzos is a Principal Streaming Architect working on large-scale data infrastructure and real-time systems. He has designed and operated streaming platforms used in production by high-scale organizations. He is a PPMC member of Apache Fluss and has been deeply involved in Apache Flink and the broader streaming ecosystem. His work focuses on unifying batch and streaming architectures, simplifying data primitives, and enabling streaming analytics and stateful workloads at scale.
Speaker
Anton Borisov
Principal Data Architect @Fresha
Anton Borisov is a Principal Data Architect building real-time data platforms for customer-facing analytics. His work spans zero-downtime Postgres migrations, CDC-driven streaming pipelines, and architectures that combine stream processing with open table formats and high-performance analytics engines. He’s a well-known voice in the streaming community, writing technical deep-dives on Apache Flink, Fluss, Iceberg, and StarRocks, with a focus on turning cutting-edge ideas into reliable production systems.