Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qconlondon.com with any comments or concerns.
The presentation "Timeouts, Retries, and Idempotency in Distributed Systems" by Sam Newman explains how to apply these three fundamental concepts to improve the resilience of distributed systems.
Introduction
Sam Newman, a technologist with expertise in cloud, microservices, and continuous delivery, explores the basics of distributed systems that are crucial for developers. The talk treats timeouts, retries, and idempotency as core concepts.
Golden Rules of Distributed Systems
- You cannot instantly beam information between two points.
- You may not always be able to reach the desired service.
- Resources are not infinite.
Timeouts
- A timeout terminates a request once it has waited past a set threshold, so the caller stops holding resources for it.
- Choosing the duration matters: too long, and stalled requests consume resources; too short, and requests that would have succeeded are terminated prematurely.
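As an illustration of the timeout idea, here is a minimal Python sketch. It is not taken from the talk: `slow_service` is a hypothetical stand-in for a remote call, and the delay and threshold values are arbitrary assumptions chosen for the example.

```python
import asyncio


async def slow_service(delay_s: float) -> str:
    """Hypothetical stand-in for a remote call that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return "ok"


async def call_with_timeout(delay_s: float, timeout_s: float) -> str:
    """Give up on the call once timeout_s seconds have elapsed."""
    try:
        return await asyncio.wait_for(slow_service(delay_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the pending call, so it stops holding resources.
        return "timed out"


# A call that finishes well within the threshold succeeds.
fast = asyncio.run(call_with_timeout(delay_s=0.01, timeout_s=0.5))

# A call that outlasts the threshold is terminated.
slow = asyncio.run(call_with_timeout(delay_s=0.5, timeout_s=0.05))
```

Here `fast` holds the service's reply and `slow` holds the fallback value. In a real system the threshold would likely be chosen from observed response times rather than hard-coded.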
Retries
- Retries are essential for handling transient failures in distributed systems.
- Randomizing retry intervals (jitter) helps prevent synchronized retries that can lead to system overloads.
- Retry logic needs limits, such as a cap on the number of attempts, so that retries do not compound an existing problem.
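The retry points above can be sketched in Python as follows. This is an assumption-laden illustration rather than the speaker's code: `TransientError`, `retry_with_jitter`, and `flaky` are hypothetical names, and the exponential backoff with a fully random wait is one common way to add jitter, not necessarily the one shown in the talk.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry_with_jitter(operation, max_attempts=4, base_delay_s=0.01,
                      max_delay_s=0.1, sleep=time.sleep):
    """Call operation, retrying transient failures up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Retry limit reached: give up and surface the error.
            # The cap doubles on each attempt (assumed backoff policy); the
            # actual wait is random so that clients do not retry in lockstep.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))


calls = {"count": 0}


def flaky():
    """Hypothetical operation that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return "done"


result = retry_with_jitter(flaky)
```

The `sleep` parameter is injectable only so the waits can be skipped in tests. The attempt limit is what keeps a persistent failure from turning into an endless stream of retries.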
Idempotency
- An idempotent operation produces the same outcome whether it is applied once or several times, so a repeated request does not trigger an erroneous repeated operation.
- Attaching a unique request ID to each operation is recommended, so that a repeat can be recognized and handled without altering the final result.
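A minimal Python sketch of the request-ID approach, under stated assumptions: `PaymentService` is a hypothetical in-memory service invented for this example, and a real system would need durable storage for the IDs it has already seen.

```python
import uuid


class PaymentService:
    """Hypothetical in-memory service that deduplicates by request ID."""

    def __init__(self, balance: int = 100):
        self.balance = balance
        self._results = {}  # request_id -> balance returned the first time

    def withdraw(self, request_id: str, amount: int) -> int:
        # A repeated request ID means this is a retry: return the stored
        # result instead of withdrawing a second time.
        if request_id in self._results:
            return self._results[request_id]
        self.balance -= amount
        self._results[request_id] = self.balance
        return self.balance


service = PaymentService()
request_id = str(uuid.uuid4())  # generated once by the client, reused on retry

first = service.withdraw(request_id, 30)
second = service.withdraw(request_id, 30)  # retry of the same request
```

Both calls return the same balance, and the account is debited only once. A genuinely new withdrawal would carry a fresh request ID and be applied normally.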
Conclusion
The talk emphasizes understanding and implementing these concepts from the start to establish resilient distributed systems. Sam Newman encourages developers to consider the presented strategies to enhance system reliability.
This is the end of the AI-generated content.
“The definition of insanity is doing the same thing over and over again” - this quote attributed to Einstein warns us of the danger of magical thinking, hoping that trying something just one more time will achieve success when before we failed. But is this really insanity?
In this talk, I’ll argue that retrying things actually does make a lot of sense, and is in fact key to improving the resilience of a distributed system. Along the way, I’ll explain the importance of timeouts, retry limits and knowing when giving up does make sense. I’ll also show how retries can be made safe (and help avoid draining your bank account), and perhaps we’ll get to examine that Einstein quote in a bit more detail.
Speaker

Sam Newman
Microservice, Cloud, CI/CD Expert, Author of "Building Microservices" and "Monolith to Microservices", 20+ Years Experience as a Developer
Sam is a technologist focusing on the areas of cloud, microservices, and continuous delivery - three topics which seem to overlap frequently. Providing consulting, training and advisory services to startups and large multi-national enterprises alike, he has over 20 years in IT as a developer, sys admin and architect. Sam is also author of the best-selling Building Microservices and the forthcoming Building Resilient Distributed Systems, both from O’Reilly, and is an experienced conference speaker.