How to Build a Database Without a Server

Modern data analytics workflows rely on scaling out to huge numbers of users and compute nodes. Managing database installations to handle this scale can be unsustainably complex and expensive. Is it instead possible to get rid of all this complexity and build a database with just a client-side library and object storage?

At Man Group we have evolved over time from managing one of the largest MongoDB installations in Europe to a serverless model where users interact directly with object storage using ArcticDB, our own database engine. What we've learnt from this should be relevant to anyone working in distributed computing, not just database development.

We will focus on topics such as:

  • Our choices around ACID and our core data structures
  • How to manage global state with lock-free techniques such as CRDTs
  • How we manage to work with relatively high latency commodity object storage
  • How object storage has evolved over time, and how advanced it is becoming
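
To make the CRDT bullet concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only set, written in Python. It is an illustration of the general technique only and is not taken from ArcticDB; the `GSet` class, the writer names and the symbol names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GSet:
    """Grow-only set: a state-based CRDT whose merge is set union.

    Hypothetical illustration, not ArcticDB code.
    """
    items: frozenset = field(default_factory=frozenset)

    def add(self, item):
        # Adding returns a new state; the existing state is never mutated.
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative and idempotent, so replicas
        # converge regardless of the order or number of merges.
        return GSet(self.items | other.items)


# Two hypothetical writers update independently, with no lock or coordinator.
writer_a = GSet().add("AAPL").add("MSFT")
writer_b = GSet().add("MSFT").add("GOOG")

# Merging in either order yields the same state.
merged_ab = writer_a.merge(writer_b)
merged_ba = writer_b.merge(writer_a)
```

Because the merge never needs to know which writer went first, each client can publish its own state and readers can combine whatever they find, which is the property that makes this family of structures attractive when there is no coordinating server.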

Interview:

What is the focus of your work?

I work on ArcticDB, a client-side database engine optimised for timeseries data that was developed from scratch at Man Group and is now its own business. A lot of my work on the project has been on an optional set of server-side processes to manage data replication and streaming data ingestion.

What’s the motivation for your talk?

ArcticDB has an unusual serverless architecture where users use our library to interact directly with object storage, with no co-ordinating server. We've learnt a lot about distributed computing and working with object storage by building this, and I want to share some interesting techniques and design choices that we've used.

Who is your talk for?

The concepts in my talk should be interesting to anyone working on distributed computing problems, whether for database development or not. It should be particularly interesting to people who work with object storage like S3. We will discuss specific design choices we've made and why, especially around our data structures and data format, so a good audience would be senior developers and technical architects who make similar decisions in their own projects.

What do you want someone to walk away with from your presentation?

New ideas about how to make useful software in a serverless architecture and an appreciation of how powerful modern object storage technologies are.

What do you think is the next big disruption in software?

It might not be as big a disruption as AI, but it will be interesting to see how columnar file formats evolve and whether a successor to Parquet will emerge as a de facto standard.


Speaker

Alex Seaton

Staff Engineer @ArcticDB, Previously Working on Quant Trading Systems @Man Group

Alex Seaton is an engineer working on ArcticDB at Man Group. ArcticDB is a high-performance dataframe database that is optimised for timeseries data and data-science workflows, and scales to petabytes of data and thousands of simultaneous users. At ArcticDB his focus has been on data replication and tick streaming infrastructure. Before joining ArcticDB, Alex built trade execution and market data systems.

Date

Wednesday Apr 9 / 10:35AM BST (50 minutes)

Location

Churchill (Ground Fl.)

Topics

Database, Platform Engineering, Serverless, CRDT, System Design

From the same track

Session

Fighting Financial Crime with AI

Wednesday Apr 9 / 01:35PM BST

Details coming soon.

Session Platform Engineering

Unleashing Kubernetes for Secure Bare-Metal Workloads

Wednesday Apr 9 / 11:45AM BST

Kubernetes is great for general cloud-native workloads but struggles with low latency and high-performance computing (HPC) due to its abstraction overhead, lack of optimized scheduling, and network inefficiencies.

Ruslan Kharitonov

Head of Compute @Citadel, Founder @Efacity, Previously Technology Fellow and Global Head of Network and Storage Software Engineering @GoldmanSachs, 25+ Years in Software and System Engineering

Session

Navigating the Regulatory Landscape in FinTech

Wednesday Apr 9 / 02:45PM BST

Details coming soon.

Sérgio Amorim

DevOps Engineer @Revolut, DevOps Lisbon Meetup Organizer

Session architecture

Latency: The Race to Zero...Are We There Yet?

Wednesday Apr 9 / 03:55PM BST

Low, predictable latency has been an edge in financial trading. Aeron has been pushing the limit of what is possible for messaging over IPC, on-premises, and in the cloud. Can we do better?

Amir Langer

Principal Software Engineer @Adaptive Financial Consulting