Beyond the Warehouse: Why BigQuery Alone Won’t Solve Your Data Problems

Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qconlondon.com with any comments or concerns.

The presentation titled Beyond the Warehouse: Why BigQuery Alone Won’t Solve Your Data Problems by Sarah Usher focuses on the limitations of relying solely on BigQuery or any data warehouse to solve extensive data challenges. Sarah emphasizes the importance of a comprehensive data strategy and architecture to truly harness data's potential.

Key Points:

  • Limitations of Data Warehouses: Data warehouses like BigQuery can manage data effectively for a time, but may struggle with latency and scalability as data complexity and the number of sources increase. That early success can lead organizations to incorrectly view the warehouse as a solution to all data problems.
  • Data Strategy Essentials: A robust data strategy involves storing and curating raw data before it enters the warehouse. This enables flexibility, allowing for changes without compromising the data's integrity.
  • Importance of Raw Data Storage: Storing raw data in its original form is crucial to allow for reprocessing and adapting to new architectures or data tools.
  • Data Curation and Usage: Curated data should be maintained alongside raw data, providing a clean, standardized version for specific use cases and preventing reprocessing burdens.
  • Source of Truth Concept: Instead of relying on the warehouse as the single source of truth, the presentation suggests designing a more flexible approach whereby the source of truth is determined at a curated layer, allowing for diverse implementations.
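
The raw and curated layers described above can be illustrated with a minimal sketch. It is not taken from the talk: the names RawStore, curate_v1, curate_v2 and build_curated are hypothetical, and an in-memory list stands in for whatever object store or file system would hold raw data in practice. The point it shows is that when raw payloads are kept unmodified, a changed curation rule can be applied by reprocessing rather than by re-collecting data.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RawStore:
    """Hypothetical append-only raw layer: payloads are kept exactly as received."""

    _records: list[str] = field(default_factory=list)

    def append(self, payload: str) -> None:
        self._records.append(payload)

    def read_all(self) -> list[str]:
        # Return a copy so callers cannot mutate the raw layer.
        return list(self._records)


def curate_v1(payload: str) -> dict:
    """First (assumed) curation rule: standardise types, keep two fields."""
    event = json.loads(payload)
    return {"user_id": str(event["user"]), "amount": float(event["amount"])}


def curate_v2(payload: str) -> dict:
    """Later (assumed) rule: also keep currency, which v1 discarded."""
    event = json.loads(payload)
    return {
        "user_id": str(event["user"]),
        "amount": float(event["amount"]),
        "currency": event.get("currency", "UNKNOWN"),
    }


def build_curated(raw: RawStore, curate: Callable[[str], dict]) -> list[dict]:
    """Derive a curated dataset from the raw layer with the given rule."""
    return [curate(payload) for payload in raw.read_all()]


raw = RawStore()
raw.append('{"user": 42, "amount": "19.99", "currency": "GBP"}')
raw.append('{"user": 7, "amount": "5"}')

curated_v1 = build_curated(raw, curate_v1)
# Requirements change: rebuild from raw instead of patching curated_v1.
curated_v2 = build_curated(raw, curate_v2)
```

Because curated_v2 is rebuilt from the untouched raw payloads, the currency field that curate_v1 dropped is still recoverable. Had only the v1 output been loaded into a warehouse, that information would have been lost.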

Sarah concludes by encouraging organizations to store raw data diligently and rethink their data architecture beyond just the warehouse, fostering innovation and sustainability.

This is the end of the AI-generated content.


Many organizations mistake the adoption of a data warehouse, like BigQuery, as the golden ticket to solving all their data challenges. But without a robust data strategy and architecture, you’re simply shifting chaos into the cloud. This talk explores why a warehouse is just one piece of the puzzle and dives into the tools, processes, and structures that can enhance your architecture beyond the warehouse. We’ll also cover practical strategies for recovering from poorly implemented data systems and building sustainable and adaptable data infrastructure.

Interview:

What is the focus of your work?

I specialise in data software engineering and architecture that enables effective data management to meet business needs. My work often involves designing and building systems that can handle increasing complexity - whether due to data volume, multiple sources, various integrations, or innovative applications of data.

What’s the motivation for your talk?

I've been discussing this topic for years because I've seen firsthand how poor data architecture can hinder companies. My goal is to help change that by sharing insights and best practices that challenge the status quo in data tech.

Who is your talk for?

My talk is for architects, engineers, and anyone responsible for shaping how data flows within an organisation.

What do you want someone to walk away with from your presentation?

A clear understanding of what good data architecture looks like, practical ideas for improving their own systems, and the realisation that they don’t have to accept the status quo. Every data system can be improved.

What do you think is the next big disruption in software?

I think we'll see an expansion of AI in automation. Whether one agrees with this direction or not, it will demand higher-quality data and more robust security practices than we have today.


Speaker

Sarah Usher

Data & Backend Engineer, Community Director, Mentor

Speaker bio: Sarah is a software engineer specialising in data engineering, backend systems, and scalable system design. She has extensive experience across industries such as banking, insurance, developer security, and digital advertising. Sarah excels in tackling challenges of scale - not just in terms of load or data size, but also data complexity. In addition to her technical work, Sarah is an active contributor to the tech community, regularly running talks, workshops, and training sessions through initiatives like Tech Risers Women, Women in Data, and Ladies of Code. She has won awards for her mentorship and leadership.

Date

Wednesday Apr 9 / 03:55PM BST (50 minutes)

Location

Whittle (3rd Fl.)

Topics

Data Architecture, System Design, Scalability

From the same track

Session Data Architecture

Reliable Data Flows and Scalable Platforms: Tackling Key Data Challenges

Wednesday Apr 9 / 10:35AM BST

There are a few common and mostly well-known challenges when architecting for data. For example, many data teams struggle to move data in a stable and reliable way from operational systems to analytics systems.

Matthias Niehoff

Head of Data and Data Architecture @codecentric AG, iSAQB Certified Professional for Software Architecture

Session AI/ML

Achieving Precision in AI: Retrieving the Right Data Using AI Agents

Wednesday Apr 9 / 11:45AM BST

In the race to harness the power of generative AI, organizations are discovering a hidden challenge: precision.

Adi Polak

Director, Advocacy and Developer Experience Engineering @Confluent, Author of "Scaling Machine Learning with Spark" and "High Performance Spark 2nd Edition"

Session AI/ML

The Data Backbone of LLM Systems

Wednesday Apr 9 / 02:45PM BST

Any LLM application has four dimensions you must carefully engineer: the code, data, models and prompts. Each dimension influences the other. That's why you must learn how to track and manage each. The trick is that every dimension has particularities requiring unique strategies and tooling.

Paul Iusztin

Senior ML/AI Engineer, MLOps, Founder @Decoding ML

Session

Panel: Modern Data Architectures

Wednesday Apr 9 / 01:35PM BST