There are cases where processing a work queue command fails, the failure is expected, and the requirement is to try again later. At the moment this is solved by an extra queue with some sort of automation that puts the failed entry back into the processing queue. While this works, it also prevents the implementation of round-robin queue processing.
Proposed solution:
Extension of queue - the type of retry policy (NoRetry, LinearRetry, ExponentialRetry), the number of retries, and the retry time interval (the time interval for LinearRetry, the time seed for ExponentialRetry). Extending the queue command instead would bring unnecessary complexity.
Extension of queue entry - the number of times the entry has been processed, the timestamp of the last attempt, and the timestamp of the expected next attempt (i.e. do not process before).
Queue processor - adjustments to the queue entry retrieval process and recording of the number of processing attempts.
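The adjusted retrieval step could look roughly like the sketch below. It is only an illustration: the `QueueEntry` class, the `eligible_entries` function and the `max_retries` parameter are hypothetical names, and the rule that an entry gets one initial attempt plus `max_retries` retries is an assumption rather than part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class QueueEntry:
    # Hypothetical stand-in for a work queue entry; only retry-related data is modelled.
    name: str
    attempt_count: int = 0
    next_attempt_time: Optional[datetime] = None  # "do not process before"


def eligible_entries(entries: List[QueueEntry], now: datetime, max_retries: int) -> List[QueueEntry]:
    """Return the entries that may be processed at `now`, in their original order."""
    result = []
    for entry in entries:
        # Assumption: one initial attempt plus max_retries retries are allowed.
        if entry.attempt_count > max_retries:
            continue
        # Skip entries whose next attempt is still in the future.
        if entry.next_attempt_time is not None and entry.next_attempt_time > now:
            continue
        result.append(entry)
    return result
```

Because waiting entries are skipped rather than moved to a separate queue, the processor can keep iterating over a single queue.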
There will be a breaking change in the behaviour of the WorkQueueEntry entity.
The problem: Each work queue class defines a data structure for its work queue entries, and this data structure lists all the data fields explicitly. The new functionality adds a new field that specifies the number of retries to each work queue entry.
That means all existing work queue entry data structures have to be modified.
Solutions:
We migrate: when upgrading the model, we add these fields to all the data structures used inside work queue classes.
We make the code backwards compatible: if the new field is not there, we do not complain unless retry functionality is turned on. In that case we throw an error.
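The backwards-compatible option can be sketched as a small validation step. This is a minimal illustration, not the actual implementation: the function name `check_retry_fields`, the use of a plain mapping for the entry data and the choice of `ValueError` are all assumptions; the field names are taken from the field list in this document.

```python
from typing import Any, List, Mapping

NO_RETRY = "NoRetry"
# Retry-related fields expected on a work queue entry when retry is enabled.
RETRY_FIELDS = ("AttemptCount", "LastAttemptTime", "NextAttemptTime", "InRetry")


def missing_retry_fields(entry_fields: Mapping[str, Any]) -> List[str]:
    """List the retry fields that the entry data structure does not define."""
    return [name for name in RETRY_FIELDS if name not in entry_fields]


def check_retry_fields(entry_fields: Mapping[str, Any], retry_type: str) -> None:
    """Raise only when retry is turned on and the entry lacks the new fields."""
    if retry_type == NO_RETRY:
        return  # Old data structures stay valid as long as retry is off.
    missing = missing_retry_fields(entry_fields)
    if missing:
        raise ValueError("Retry is enabled but fields are missing: " + ", ".join(missing))
```

With this approach existing work queue classes keep working unchanged until somebody switches their queue to LinearRetry or ExponentialRetry.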
We will need an extra parameter for the exponential retry, called ExponentialRetryBase.
The delay before each retry will be a random number between min and max values calculated with an exponential function, where x is the retry number:
min = (ExponentialRetryBase ^ (x - 1)) * RetryIntervalSeconds
max = (ExponentialRetryBase ^ x) * RetryIntervalSeconds
An example for ExponentialRetryBase=2, RetryIntervalSeconds=35:
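| retry | (a^x)*Interval | min | max |
|---|---|---|---|
| 1 | 70 | 35 | 70 |
| 2 | 140 | 70 | 140 |
| 3 | 280 | 140 | 280 |
| 4 | 560 | 280 | 560 |
| 5 | 1120 | 560 | 1120 |
| 6 | 2240 | 1120 | 2240 |
| 7 | 4480 | 2240 | 4480 |
| 8 | 8960 | 4480 | 8960 |
| 9 | 17920 | 8960 | 17920 |
| 10 | 35840 | 17920 | 35840 |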
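The min and max formulas can be expressed as a short function. The sketch below is an illustration only: the function names are hypothetical, and drawing the delay from a uniform distribution is an assumption, since the text only says "a random number between min and max".

```python
import random
from typing import Optional, Tuple


def retry_delay_bounds(retry: int, base: float, interval_seconds: float) -> Tuple[float, float]:
    """Return (min, max) delay in seconds for the 1-based retry number."""
    if retry < 1:
        raise ValueError("retry must be >= 1")
    low = (base ** (retry - 1)) * interval_seconds   # min = base^(x-1) * interval
    high = (base ** retry) * interval_seconds        # max = base^x * interval
    return low, high


def retry_delay(retry: int, base: float, interval_seconds: float,
                rng: Optional[random.Random] = None) -> float:
    """Pick a random delay between min and max (uniform distribution assumed)."""
    if rng is None:
        rng = random.Random()
    low, high = retry_delay_bounds(retry, base, interval_seconds)
    return rng.uniform(low, high)


if __name__ == "__main__":
    # Print the bounds for ExponentialRetryBase=2, RetryIntervalSeconds=35.
    for x in range(1, 11):
        print(x, *retry_delay_bounds(x, 2, 35))
```

Randomizing the delay within each interval spreads out entries that failed at the same moment, so they do not all come back for processing at the same time.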
This means that we need the following new fields.
WorkQueueEntry:
AttemptCount: int, non-nullable, default 0
LastAttemptTime: date, nullable
NextAttemptTime: date, nullable
InRetry: bool, non-nullable, default false
WorkQueue:
refWorkQueueRetryTypeId: Guid, non-nullable, default no retry
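As a rough illustration of how the WorkQueueEntry fields and their defaults fit together, the sketch below models them as a small data class and shows one way a failed attempt might be recorded. The class name `RetryState`, the function `record_failed_attempt` and the interpretation of InRetry as "waiting for another attempt" are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RetryState:
    # Hypothetical mirror of the new WorkQueueEntry fields and their defaults.
    attempt_count: int = 0                        # AttemptCount
    last_attempt_time: Optional[datetime] = None  # LastAttemptTime
    next_attempt_time: Optional[datetime] = None  # NextAttemptTime
    in_retry: bool = False                        # InRetry


def record_failed_attempt(state: RetryState, now: datetime, delay_seconds: float) -> None:
    """Update the retry fields after a failed processing attempt."""
    state.attempt_count += 1
    state.last_attempt_time = now
    # The entry must not be picked up again before this time.
    state.next_attempt_time = now + timedelta(seconds=delay_seconds)
    # Assumption: InRetry marks an entry that is waiting for another attempt.
    state.in_retry = True
```

The delay is passed in as a parameter so that the same bookkeeping works for both LinearRetry and ExponentialRetry.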