Implementing exponential backoff
This page explains how to use truncated exponential backoff to ensure your devices do not generate excessive load.
When devices retry calls without waiting, they can produce a heavy load on the ClearBlade IoT Core servers. ClearBlade IoT Core automatically limits projects that generate excessive load. Even a small fraction of overactive devices can trigger limits that affect all devices in the same Google Cloud project.
You are strongly encouraged to implement truncated exponential backoff with introduced jitter to avoid triggering these limits. If you have questions or would like to discuss your algorithm’s specifics, email iotcore@clearblade.com with this information:
- IoT Core registry name 
- Number of devices connected 
- Industry 
Truncated exponential backoff is a standard error-handling strategy for network applications. Clients periodically retry a failed request with increasing delays between requests. Clients should use truncated exponential backoff for all requests to ClearBlade IoT Core that return HTTP 5xx or 429 response codes, and when reconnecting after a disconnection from the MQTT server.
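As a minimal sketch of the HTTP half of that rule, the check below decides whether a response status calls for a backed-off retry. The function name is_retryable is hypothetical and not part of any ClearBlade library. MQTT disconnections are not covered here and would need to be detected by your MQTT client.

```python
def is_retryable(status_code: int) -> bool:
    """Return True for the HTTP statuses named above: 429 and any 5xx.

    Hypothetical helper for illustration only.
    """
    return status_code == 429 or 500 <= status_code <= 599
```

Other 4xx responses usually indicate a problem with the request itself, so retrying them unchanged is unlikely to help.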
Example algorithm
An exponential backoff algorithm retries requests, exponentially increasing the waiting time between retries up to a maximum backoff time. For example:
- Make a request to ClearBlade IoT Core. 
- If the request fails, wait 1 + random_number_milliseconds seconds and retry the request.
- If the request fails, wait 2 + random_number_milliseconds seconds and retry the request.
- If the request fails, wait 4 + random_number_milliseconds seconds and retry the request.
- And so on, up to a maximum_backoff time.
- Continue waiting and retrying up to some maximum number of retries, but do not increase the waiting period between retries. 
where:
- The wait time is min(((2^n)+random_number_milliseconds), maximum_backoff), with n incremented by 1 for each iteration (request).
- random_number_milliseconds is a random number of milliseconds less than or equal to 1000. This helps to avoid cases in which some situation synchronizes many clients, and all retry at once, sending requests in synchronized waves. The random_number_milliseconds value is recalculated after each retry request.
- maximum_backoff is typically 32 or 64 seconds. The appropriate value depends on the use case.
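The wait-time formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not an official implementation: the function name backoff_delay and the rng parameter are hypothetical, n is assumed to start at 0 for the first retry, and the result is expressed in seconds, so the millisecond jitter is divided by 1000 before it is added.

```python
import random


def backoff_delay(n, maximum_backoff=64.0, rng=random):
    """Return the wait in seconds before retry n (n = 0 for the first retry).

    Hypothetical helper: computes min((2^n) + jitter, maximum_backoff).
    """
    # Jitter of 0 to 1000 milliseconds, drawn again on every call.
    random_number_milliseconds = rng.randint(0, 1000)
    return min((2 ** n) + random_number_milliseconds / 1000.0, maximum_backoff)
```

With the default maximum_backoff of 64 seconds, calls with n = 0, 1, 2 return values in the ranges 1 to 2, 2 to 3, and 4 to 5 seconds, and every call with n of 6 or more returns exactly 64.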
The client can continue retrying after it has reached the maximum_backoff time. Retries after this point do not need to continue increasing backoff time. For example, suppose a client uses a maximum_backoff time of 64 seconds. After reaching this value, the client can retry every 64 seconds. At some point, clients should be prevented from retrying indefinitely.
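One way to combine the capped delay with a retry limit is sketched below. Everything named here is an assumption made for illustration: call_with_backoff, RetriesExhausted, and the default of 8 retries are hypothetical, and request stands for any callable of yours that returns True on success and False on a retryable failure.

```python
import random
import time


class RetriesExhausted(Exception):
    """Raised when every attempt has failed (hypothetical name)."""


def call_with_backoff(request, max_retries=8, maximum_backoff=64.0,
                      sleep=time.sleep):
    """Call request() until it returns True or max_retries retries are used.

    Returns the number of retries that were needed.
    """
    for n in range(max_retries + 1):
        if request():
            return n
        if n == max_retries:
            break  # no retries left, so do not wait again
        # Jitter of 0 to 1000 ms, recalculated for each retry.
        jitter = random.randint(0, 1000) / 1000.0
        # The delay doubles each time until it is capped at maximum_backoff.
        sleep(min((2 ** n) + jitter, maximum_backoff))
    raise RetriesExhausted(f"gave up after {max_retries} retries")
```

The sleep parameter exists only so the waiting can be replaced in tests. With the defaults, the delays grow from about 1 second to the 64-second cap and stay there, and the client gives up after the eighth retry instead of retrying indefinitely.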
The wait time between retries and the number of retries depend on your use case and network conditions.
