Chronon - Mixed-Workload Data Processing Framework

Abstract

Chronon is a data processing framework open-sourced by Airbnb. It has been adopted by organizations such as Stripe, Netflix, OpenAI, and Uber. Originally built for ML applications, it now powers a variety of use cases: heuristics for rule engines, context for LLMs, and user-facing and business-facing metrics.

Teams adopt Chronon because it generates training data at scale and serves features with very low latency through a simple, high-level API. It abstracts away the effort of manually building batch and stream processing pipelines, indexes, and services.

This talk will focus largely on the algorithms and optimizations inside Chronon. We will touch only briefly on the core concepts of the API and a couple of example use cases.


Speaker

Nikhil Simha

Co-Founder & CTO @zipline.ai, Author of "Chronon Feature Platform", Previously @Airbnb, @Meta, and @Walmartlabs

Nikhil Simha Raprolu is the Co-founder & CTO at zipline.ai. Before that, he worked on the ML Infra team at Airbnb, where he open-sourced Chronon. At Facebook, he worked on stream processing systems, schedulers, and compilers, e.g., Stylus and Turbine. Earlier, he worked on distributed data processing infrastructure at Amazon and Walmart Labs.

From the same track

Session Kafka

Introducing Tansu.io -- Rethinking Kafka for Lean Operations

Tuesday Mar 17 / 01:35PM GMT

What if Kafka brokers were ephemeral, stateless and leaderless with durability delegated to a pluggable storage layer?

Peter Morgan

Founder @tansu.io

Session Machine Learning Infrastructure

From S3 to GPU in One Copy: Rethinking Data Loading for ML Training

Tuesday Mar 17 / 11:45AM GMT

ML training pipelines treat data as static. Teams spend weeks preprocessing datasets into WebDataset or TFRecords, and when they want to experiment with curriculum learning or data mixing, they reprocess everything from scratch.

Onur Satici

Staff Engineer @SpiralDB & a Core Maintainer of Vortex (LF AI & Data), Previously Building Distributed Systems @Palantir

Session Streaming

The Rise of the Streamhouse: Idea, Trade-Offs, and Evolution

Tuesday Mar 17 / 03:55PM GMT

Over the last decade, streaming architectures have largely been built around topic-centric primitives—logs, streams, and event pipelines—then stitched together with databases, caches, OLAP engines, and (increasingly) new serving systems.

Giannis Polyzos

Principal Streaming Architect @Ververica

Anton Borisov

Principal Data Architect @Fresha

Session Generative AI

Ontology‐Driven Observability: Building the E2E Knowledge Graph at Netflix Scale

Tuesday Mar 17 / 10:35AM GMT

As Netflix scales hundreds of client platforms, microservices, and infrastructure components, correlating user experience with system performance has become a hard data problem, not just an observability one.

Prasanna Vijayanathan

Engineer @Netflix

Renzo Sanchez-Silva

Engineer @Netflix

Session

Connecting the Dots: Modern Data Engineering & Architectures (Limited Space - Registration Required)

Tuesday Mar 17 / 05:05PM GMT