Corvus.Retry
This provides support for retriable and reliable/long running methods. This was written before https://github.com/App-vNext/Polly was invented. For new code, we suggest you look at Polly.
It is built for netstandard2.0.
Features
There are three types of retry operation in this library:
RetryTask
Starting a task to execute a function asynchronously, with retry
Retriable
Wrapping a synchronous or asynchronous operation with retry semantics
ReliableTaskRunner
Wrapping a long-running asynchronous operation in a host that ensures it continues to run even after a transient failure.
Usage
Retriable
This is the most common usage pattern. It wraps an existing method, and allows you to re-execute the method if it throws an exception.
Retriable.Retry(() => DoSomethingThatMightFail());
You can return values from your method
SomeResult result = Retriable.Retry(() => new SomeResult());
You can also use asynchronous methods
Task resultTask = Retriable.RetryAsync(() => SomeOperationAsync());
Policy and Strategy
Two types help you to control when a failed operation is retried, and how that retry occurs. IRetryPolicy
and IRetryStrategy
.
IRetryPolicy
The retry policy gives you the ability to determine whether the operation should be retried or not, based on the exception that has been thrown.
This typically means you can distinguish between terminal exceptions (like bad input) and transient exceptions (like a network timeout).
There are three built-in retry policies:
AnyException
This will always retry on any exception, and is the default for Retriable
.
DoNotRetryPolicy
This will never retry, regardless of the exception. You use this to disable retry, without having to comment out your retry code.
Typically, you would do this with some kind of conditional
IRetryPolicy policy = isDebuggingOrWhatever ? new DoNotRetryPolicy() : new AnyExceptionPolicy();
var result = Retriable.Retry(() => DoSomething(), new Count(10), policy);
AggregatePolicy
This gives you a means of ANDing together multiple policies. The AggregatePolicy
only succeeds if ALL of its children succeed.
var aggregatePolicy =
new AggregatePolicy
{
Policies =
{
new CustomPolicy1(),
new CustomPolicy2(),
},
}
Implementing a custom policy
It is very simple to create your own custom policy, and you will frequently do so. You implement the IRetryPolicy
interface, and its bool CanRetry(Exception exception);
method.
For example, let's imagine we were consuming an HttpService which occasionally gives us a 429 - Too Many Requests
error in an HttpServiceException
.
We can implement a policy which will only retry if we recieve this exception.
public class RetryOnTooManyRequestsPolicy : IHttpPolicy
{
bool CanRetry(Exception exception)
{
return (exception is HttpServiceException httpException && httpException.StatusCode == 429);
}
}
IRetryStrategy
The IRetryStrategy
controls the way in which the operation is retried. It controls both the delay between each retry and the number of times that it will be retried.
There are several strategies provided out-of-the-box
Count
This simply retries a specified number of times, with no delay. Retriable
operations default to a new Count(10)
strategy.
DoNotRetry
This is the strategy equivalent of the DoNotRetryPolicy
. It forces a retry to be abandoned, regardless of policy.
Linear
This retries a specified number of times, with an constant delay between each retry.
For example, new Linear(TimeSpan.FromSeconds(1), 10)
will retry up to 10 times. Each time it will delay by 1s. So the first retry will be after 1s (wall clock time 1s), the second after another 1s (wall clock time 2s), the third after another 1s (wall clock time 3s).
Incremental
This retries a specified number of times, with an arithmetically increasing delay between each retry.
For example, new Incremental(TimeSpan.FromSeconds(1), 10)
will retry up to 10 times. Each time it will increase the delay by 1s. So the first retry will be after 1s (wall clock time 1s), the second after another 2s (wall clock time 3s), the third after another 3s (wall clock time 6s).
This allows a slowly increasing delay.
Backoff
This retries a specified number of times, with a delay that increases geometrically between each retry.
For example, new Backoff(10, TimeSpan.FromSeconds(1))
will retry up to 10 times. Each time it will increase the delay by a value calculated roughly like this: 2^n * (delta +/- a small random fudge)
.
This allows a rapidly increasing delay, with a bit of random jitter addded to avoid contention.
You can also set a MinBackoff
and MaxBackoff
to limit the lower and upper bounds of the delay.
Implementing a custom strategy
It is slightly more complex to implement a custom retry strategy than it was to implement a retry policy. Although you can directly implemet the IRetryStrategy
interface, it is usually simpler to derive from the abstract RetryStrategy
class.
In that case, you need to override the TimeSpan PrepareToRetry(Exception lastException)
method. You can examine the exception and use your internal state to determine two things:
- Should we record this exception, or filter it out? Typically, we will want to record the exception, and, if so, we call
this.AddException(lastException)
. - For how long should we delay before retrying? We return a
TimeSpan
from the method to determine this.
So, for example, Count
implements the method like this:
public override TimeSpan PrepareToRetry(Exception lastException)
{
if (lastException is null)
{
throw new ArgumentNullException(nameof(lastException));
}
this.AddException(lastException);
this.tryCount++;
return TimeSpan.Zero;
}
It updates its internal state to keep a count of the number of retries, adds the exception to the list of exceptions we have seen, and tells the strategy not to delay before retrying.
We also override the bool CanRetry { get; }
property to determine wether this strategy still allows us to retry.
The Count
strategy, for example, simply uses its internal state to determine whether to continue.
/// <inheritdoc/>
public override bool CanRetry
{
get
{
return this.tryCount < this.maxTries;
}
}
ISleepService
Retriable
uses an implementation of an ISleepService
to delay between retries. The sleep service offers synchronous Sleep(TimeSpan)
and asynchronous SleepAsync(TimeSpan)
methods to achieve a delay.
The default SleepService
implementation uses Thread.Sleep()
for the synchronous delay, and Task.Delay()
for the asynchronous implementation.
For test purposes, for example, you could create an ISleepService
implementation that incremented a virtual wallclock, and/or removed the delay entirely. You would set this using the Retriable.SleepServices
static property.
RetryTask
The RetryTask
is an analog of Task
which has built-in retry semantics.
Instead of calling Task.Factory.StartNew()
you can call RetryTask.Factory.StartNew()
with all the familiar parameters.
In addition to those usual parameters, you can pass an IRetryPolicy
and an IRetryStrategy
to control the retry behaviour. The defaults are the same as for Retriable
.
You can also control the sleep behaviour in the same was as Retriable
by setting an ISleepService
on
ReliableTaskRunner
This is used to run a service-like operation that is supposed to execute 'forever' (or, at least, until cancellation). If the method fails, it should be re-started.
You start a task using the static ReliableTaskRunner.Run()
method. For example
ReliableTaskRunner runner = ReliableTaskRunner.Run(cancellationToken => DoSomeOperationAsync(cancellationToken));
When you want to terminate the task, you call the StopAsync()
method e.g.
await runner.StopAsync();
As with the other retry methods, there are overloads where you can pass an IRetryPolicy
control the restart behaviour.
Licenses
Corvus.Retry is available under the Apache 2.0 open source license.
For any licensing questions, please email licensing@endjin.com
Project Sponsor
This project is sponsored by endjin, a UK based Microsoft Gold Partner for Cloud Platform, Data Platform, Data Analytics, DevOps, and a Power BI Partner.
For more information about our products and services, or for commercial support of this project, please contact us.
We produce two free weekly newsletters; Azure Weekly for all things about the Microsoft Azure Platform, and Power BI Weekly.
Keep up with everything that's going on at endjin via our blog, follow us on Twitter, or LinkedIn.
Our other Open Source projects can be found on GitHub
Code of conduct
This project has adopted a code of conduct adapted from the Contributor Covenant to clarify expected behavior in our community. This code of conduct has been adopted by many other projects. For more information see the Code of Conduct FAQ or contact hello@endjin.com with any additional questions or comments.
IP Maturity Matrix (IMM)
The IMM is endjin's IP quality framework.