Understanding Retries, Exponential Backoffs, and Circuit Breakers in Distributed Systems

Dilanka Muthukumarana
Aug 10, 2024

Architecting a system is about balancing trade-offs at a point in time while keeping the design extensible.

In distributed systems, calls to other services can fail for reasons outside your control, so reliability and resilience have to be designed in. Three patterns that help with this are Retries, Exponential Backoffs, and Circuit Breakers. Let’s break them down!

1. 🔄 Retries

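When a request fails because of a transient problem, such as a timeout or a brief network glitch, trying again before giving up completely is often enough. This is known as a retry. Retries are safest for idempotent operations, where repeating the request has no additional effect.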

Example

Imagine you are trying to call an API to get data. If the API call fails, you try again a few more times before deciding it’s really down.

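using System;

int retryCount = 3; // total number of attempts
for (int i = 0; i < retryCount; i++)
{
    try
    {
        var response = CallApi(); // placeholder for the actual API call
        if (response.IsSuccessStatusCode)
        {
            // Process the response
            break; // success: stop retrying
        }
        // Non-success status code: fall through and try again
    }
    catch (Exception ex)
    {
        // Log the exception, then continue with the next attempt
    }
}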
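
The loop above does not say what happens once every attempt has failed. One way to make that explicit is to wrap the loop in a small helper that returns the first successful result and rethrows the last exception when the attempts run out. This is a minimal sketch, not a definitive implementation: the `Retry` class and its `Execute` method are hypothetical names introduced here for illustration, and the sketch treats only exceptions as failures.

```csharp
using System;

public static class Retry
{
    // Runs `operation` up to `maxAttempts` times and returns the first result.
    // If every attempt throws, the exception from the last attempt propagates.
    // (Hypothetical helper for illustration; not part of any library.)
    public static T Execute<T>(Func<T> operation, int maxAttempts)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Attempts remain: swallow the exception (log it in real code)
                // and loop again. On the final attempt the filter is false,
                // so the exception reaches the caller.
            }
        }
    }
}
```

A call such as `Retry.Execute(() => CallApi(), 3)` would then either return a response or throw after the third failed attempt, so the caller cannot silently ignore a dead service.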

2. 📈 Exponential Backoffs

Simply retrying immediately is not always the best strategy. Exponential backoff is a technique where each retry waits longer than the previous one, reducing the load on the system and giving it time to recover.

Example

If the first retry waits for 1 second, the next might wait for 2 seconds, then 4 seconds, and so on.

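using System;
using System.Threading;

int maxRetries = 5;
int delay = 1000; // initial delay in milliseconds

for (int i = 0; i < maxRetries; i++)
{
    try
    {
        var response = CallApi(); // placeholder for the actual API call
        if (response.IsSuccessStatusCode)
        {
            // Process the response
            break;
        }
        // Non-success status code: fall through to the backoff below
    }
    catch (Exception ex)
    {
        // Log the exception
    }

    // Back off whether the call threw or returned a non-success status.
    // There is nothing to wait for after the final attempt.
    if (i < maxRetries - 1)
    {
        Thread.Sleep(delay);
        delay *= 2; // Exponential backoff: 1s, 2s, 4s, 8s
    }
}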
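
The doubling schedule described above can also be written as a formula: the wait before retry n (counting from 0) is the initial delay multiplied by 2 to the power n. The sketch below computes that value directly. The `Backoff` class, the `DelayMs` method, and the upper limit `maxDelayMs` are assumptions added for illustration; the cap is not part of the example above, it simply keeps the wait from growing without bound.

```csharp
using System;

public static class Backoff
{
    // Delay in milliseconds before retry number `retry` (0-based):
    // initialDelayMs * 2^retry, limited to maxDelayMs.
    // (Hypothetical helper; the default values are illustrative only.)
    public static int DelayMs(int retry, int initialDelayMs = 1000, int maxDelayMs = 30000)
    {
        if (retry < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(retry));
        }

        // Compute in double so a large retry number cannot overflow an int.
        double delay = initialDelayMs * Math.Pow(2, retry);
        return (int)Math.Min(delay, maxDelayMs);
    }
}
```

With the defaults, retries 0, 1, and 2 wait 1000, 2000, and 4000 milliseconds, matching the 1, 2, 4 second sequence in the example, and any retry whose computed delay exceeds 30000 milliseconds waits 30000 instead.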

3. 🔌 Circuit Breakers

Retries and exponential backoffs are great, but what if a service is really down?

Continuously retrying can overwhelm it further. Circuit breakers prevent this by cutting off requests to a failing service after a threshold is reached.

Example

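A circuit breaker “opens” after a certain number of consecutive failures and then rejects calls without contacting the service. After a cool-down period it lets trial requests through (often called the half-open state) and “closes” again once enough of them succeed.

using System;

// Note: this simple version is not thread-safe.
public class CircuitBreaker
{
    private int failureCount = 0;
    private int successCount = 0;
    private const int failureThreshold = 3;
    private const int successThreshold = 5;
    private bool circuitOpen = false;
    private DateTime openedAt;
    // Cool-down before trial calls are allowed (illustrative value)
    private static readonly TimeSpan openDuration = TimeSpan.FromSeconds(30);

    public bool CallService(Func<bool> action)
    {
        // While open, reject calls until the cool-down has elapsed.
        // After that, calls go through as trials (half-open).
        if (circuitOpen && DateTime.UtcNow - openedAt < openDuration)
        {
            Console.WriteLine("Circuit is open. Skipping call.");
            return false;
        }

        bool succeeded;
        try
        {
            succeeded = action();
        }
        catch
        {
            succeeded = false; // treat an exception as a failure
        }

        if (succeeded)
        {
            failureCount = 0; // only consecutive failures count
            if (circuitOpen)
            {
                successCount++;
                if (successCount >= successThreshold)
                {
                    circuitOpen = false; // enough trial calls succeeded
                    successCount = 0;
                }
            }
            return true;
        }

        successCount = 0;
        failureCount++;
        if (circuitOpen || failureCount >= failureThreshold)
        {
            // Threshold reached, or a trial call failed: (re)open the circuit
            circuitOpen = true;
            openedAt = DateTime.UtcNow;
            failureCount = 0;
        }
        return false;
    }
}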

🎯 Key Takeaways:

  • Retries — Give operations another chance to succeed.
  • Exponential Backoffs — Spread out retries to prevent overwhelming the system.
  • Circuit Breakers — Prevent continuous failures from bringing down the system.

Using these patterns together can significantly improve the resilience and reliability of your distributed systems.
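
To make "using these patterns together" concrete, here is a minimal sketch that combines all three in one class: it retries a failing action, doubles the wait between attempts, and stops calling the service for a while once too many consecutive failures pile up. Everything in it is an assumption made for illustration: the `ResilientCaller` name, its constructor parameters, and the simplification that a single successful call after the cool-down closes the circuit again. It is also not thread-safe.

```csharp
using System;
using System.Threading;

public class ResilientCaller
{
    private readonly int maxAttempts;
    private readonly int initialDelayMs;
    private readonly int failureThreshold;
    private readonly TimeSpan openDuration;
    private int consecutiveFailures = 0;
    private DateTime? openedAt = null;

    public ResilientCaller(int maxAttempts, int initialDelayMs,
                           int failureThreshold, TimeSpan openDuration)
    {
        this.maxAttempts = maxAttempts;
        this.initialDelayMs = initialDelayMs;
        this.failureThreshold = failureThreshold;
        this.openDuration = openDuration;
    }

    // True while the circuit is open and the cool-down has not yet elapsed.
    public bool IsOpen =>
        openedAt.HasValue && DateTime.UtcNow - openedAt.Value < openDuration;

    // Returns true if `action` succeeded within maxAttempts; false if the
    // circuit rejected the call or every attempt failed.
    public bool Call(Func<bool> action)
    {
        int delay = initialDelayMs;
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (IsOpen)
            {
                return false; // circuit breaker: fail fast
            }

            bool ok;
            try { ok = action(); }
            catch (Exception) { ok = false; } // an exception counts as a failure

            if (ok)
            {
                consecutiveFailures = 0;
                openedAt = null; // close the circuit
                return true;
            }

            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold)
            {
                openedAt = DateTime.UtcNow; // open the circuit
                consecutiveFailures = 0;
                return false;
            }

            if (attempt < maxAttempts - 1)
            {
                Thread.Sleep(delay); // exponential backoff between attempts
                delay *= 2;
            }
        }
        return false;
    }
}
```

In this arrangement the retry loop handles brief glitches, the growing delay gives the service room to recover, and the failure threshold stops the retries from piling onto a service that is really down.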

💡 What strategies do you use to manage failures in your systems? Share your experiences!
