Abstract
High-demand events can cause sudden traffic spikes that overwhelm even well-designed systems. In ticketing platforms, millions of users — alongside increasingly sophisticated automated agents — may arrive simultaneously, placing extreme pressure on backend services.
At SeatGeek, we observed that even elastic infrastructure has limits: autoscaling takes time to react, and systems must survive while capacity catches up. To address this gap, we designed a layered shielding architecture that distributes defensive responsibilities across multiple parts of the platform.
At the edge, caching, shielding, and admission control mechanisms such as queueing absorb traffic bursts before they reach the origin. API gateways enforce fairness through rate limiting and request validation. Deeper in the stack, Kubernetes-native networking policies and platform controls help contain failures and protect service boundaries.
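As an illustration of the gateway-level fairness mentioned above, here is a minimal sketch of per-client rate limiting using a token bucket. It is not SeatGeek's implementation; the class names `TokenBucket` and `PerClientLimiter`, the `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1.0):
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PerClientLimiter:
    """Hypothetical gateway helper: one bucket per client identifier."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self._buckets = {}

    def allow(self, client_id):
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self._buckets[client_id] = bucket
        return bucket.allow()
```

Because each client has its own bucket, one aggressive client exhausts only its own budget. A production gateway would also need to evict idle buckets and share state across gateway instances, both of which this sketch omits.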
This layered approach allows the system to shed load early, protect critical services, and degrade gracefully during extreme demand. But resilience is not static: traffic patterns evolve, new bottlenecks emerge, and systems must continuously adapt through observability and feedback signals.
In this talk, we will explore the architecture and operational lessons behind building multi-layer shields that protect core systems under internet-scale traffic, and share practical insights for designing resilient platforms that can withstand traffic stampedes without bringing down the entire ecosystem.
Interview:
What is your session about, and why is it important for senior software developers?
This talk explores how to design resilient systems that can withstand extreme traffic spikes without collapsing. Using real-world examples from ticketing platforms, I will show how distributing defensive responsibilities across layers — edge, gateway, and platform infrastructure — helps protect critical services during sudden demand surges. Senior engineers often operate systems where scaling alone is not enough; resilience requires intentional architecture and operational controls. The session focuses on practical patterns that help systems degrade gracefully rather than fail catastrophically.
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
Traffic patterns are becoming less predictable as automated agents, AI-driven clients, and global user demand increase system pressure. At the same time, modern platforms rely on complex distributed architectures in which small failures can quickly cascade. Leaders need to design systems that assume sudden spikes and evolving traffic behavior. Building resilience through layered defenses and clear operational signals is becoming essential for maintaining reliability at scale.
What are the common challenges developers and architects face in this area?
A common misconception is that cloud elasticity alone solves scalability problems. In reality, autoscaling takes time to react, and systems often experience instability before capacity catches up. Teams also struggle to identify truly critical services, manage noisy-neighbor effects in shared infrastructure, and detect early signals of system stress. Designing architectures that shed load early and protect the core system requires coordination across multiple platform layers.
What's one thing you hope attendees will implement immediately after your talk?
I hope attendees rethink where traffic control happens in their systems. Instead of relying solely on backend scaling, they should introduce earlier defenses — such as caching, admission control, and rate limiting — to absorb pressure before it reaches core services. Even small changes at the edge or gateway layer can dramatically improve system stability during traffic spikes.
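One possible shape for the admission control mentioned above is a bounded "waiting room" placed in front of the core service. The sketch below is illustrative only: the `AdmissionGate` class, its `max_active` and `max_waiting` limits, and the returned status strings are assumptions for this example, not a description of any specific product.

```python
from collections import deque


class AdmissionGate:
    """Admit up to `max_active` users, queue up to `max_waiting`, shed the rest."""

    def __init__(self, max_active, max_waiting):
        self.max_active = max_active
        self.max_waiting = max_waiting
        self.active = set()
        self.waiting = deque()

    def arrive(self, user_id):
        # Admit directly only if there is room and nobody is already waiting,
        # so that queued users keep their place in line.
        if len(self.active) < self.max_active and not self.waiting:
            self.active.add(user_id)
            return "admitted"
        if len(self.waiting) < self.max_waiting:
            self.waiting.append(user_id)
            return "queued"
        # Both the service and the waiting room are full: reject early.
        return "shed"

    def leave(self, user_id):
        """Release a slot and promote waiting users in arrival order."""
        self.active.discard(user_id)
        promoted = []
        while self.waiting and len(self.active) < self.max_active:
            nxt = self.waiting.popleft()
            self.active.add(nxt)
            promoted.append(nxt)
        return promoted
```

The point of the design is that the rejection decision is cheap and happens before any backend work starts, so the core service only ever sees at most `max_active` concurrent users.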
What makes QCon stand out as a conference for senior software professionals?
QCon focuses on real engineering experience rather than hype or vendor-driven content. Speakers share lessons learned from operating large-scale systems in production, including the trade-offs and failures behind architectural decisions. This creates an environment where senior engineers can learn from peers facing similar challenges. The emphasis on practical insight and honest technical discussion makes QCon particularly valuable.
Speaker
Anderson Parra
Staff Software Engineer @SeatGeek
Anderson Parra is a Staff Software Engineer on SeatGeek’s Cloud Platform team, where he works on the infrastructure that powers high-demand ticket onsales. His work focuses on building resilient systems that can withstand internet-scale traffic and on designing layered defenses across edge, API gateways, and Kubernetes platforms to protect core services while preserving a fair user experience.
Over the past 18+ years, Anderson has built and operated large-scale distributed systems handling massive traffic and data volumes for companies in Brazil, Ireland, Germany, the UK, and the United States. He has worked across a wide range of technologies, including Java, Scala, Go, Ruby, Python, JavaScript, and Lua, with a strong focus on platform engineering and distributed systems.
Anderson holds a master’s degree in distributed systems, for which his research was titled “A Lightweight Reconfiguration Solution for Paxos”.