Azure SDK for Embedded C
Device Provisioning and IoT Hub service protocols require additional state management on top of the MQTT protocol. The Azure IoT Hub and Provisioning clients for C provide a common programming model. The clients must be layered on top of an MQTT client selected by the application developer.
The following aspects are handled by the IoT clients:
The following aspects need to be handled by the application or convenience layers:
For more information about Azure IoT services using MQTT see this article.
In order to port the clients to a target platform the following items are required:
uint8_t must be defined.
Optionally, the IoT services support MQTT tunneling over WebSocket Secure, which allows bypassing firewalls where port 8883 is not open. Using WebSockets also allows devices that must go through a web proxy to connect. Application developers are responsible for setting up the wss:// tunnel.
The application code is required to initialize the TLS and MQTT stacks. Detailed information about TLS over TCP/IP requirements can be found at https://docs.microsoft.com/azure/iot-hub/iot-hub-tls-support.
Two authentication schemes are currently supported: X509 Client Certificate Authentication and Shared Access Signature authentication.
When X509 client authentication is used, the MQTT password field should be an empty string.
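As an illustration of the X509 case, the following minimal sketch fills the MQTT CONNECT fields by hand. The struct and function names (`sketch_connect_fields`, `sketch_fill_x509_connect`) are hypothetical and not part of the SDK; the user name layout `{hub hostname}/{device id}/?api-version={version}` follows the format documented for IoT Hub over MQTT, and the API version string is left to the caller as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for MQTT CONNECT fields; not an SDK type. */
typedef struct
{
  char client_id[64];
  char user_name[192];
  const char* password; /* empty string for X509 client authentication */
  bool clean_session;
} sketch_connect_fields;

/* Fills the CONNECT fields for X509 authentication.
 * Returns false if one of the fixed-size buffers is too small. */
bool sketch_fill_x509_connect(
    sketch_connect_fields* out,
    const char* hub_hostname,
    const char* device_id,
    const char* api_version)
{
  /* The MQTT client id is the device id. */
  int n = snprintf(out->client_id, sizeof(out->client_id), "%s", device_id);
  if (n < 0 || (size_t)n >= sizeof(out->client_id))
  {
    return false;
  }

  /* User name: {hub hostname}/{device id}/?api-version={version} */
  n = snprintf(
      out->user_name,
      sizeof(out->user_name),
      "%s/%s/?api-version=%s",
      hub_hostname,
      device_id,
      api_version);
  if (n < 0 || (size_t)n >= sizeof(out->user_name))
  {
    return false;
  }

  out->password = "";         /* X509: the password field stays empty */
  out->clean_session = false; /* keeps enqueued C2D messages */
  return true;
}
```

In a real application the user name would normally come from the IoT client rather than being formatted by hand; the sketch only shows which fields the MQTT stack ends up receiving.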
If SAS tokens are used, the following APIs create the token and refresh its lifetime upon reconnect.
Example:
Recommended defaults:
AZ_IOT_DEFAULT_MQTT_CONNECT_KEEPALIVE_SECONDS
We recommend always using Clean Session false when connecting to IoT Hub. Connecting with Clean Session true removes all enqueued C2D messages.
Each service requiring a subscription implements a function similar to the following:
Example:
Note: If the MQTT stack allows, it is recommended to subscribe prior to connecting.
Each action (e.g. send telemetry, request twin) is represented by a separate public API. The application is responsible for filling in the MQTT payload with the format expected by the service.
Example:
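A minimal sketch of what a telemetry publish needs from the application side is shown here. The helper name `sketch_telemetry_topic` is hypothetical; it formats the device-to-cloud topic `devices/{device id}/messages/events/` by hand only to show the topic shape. The payload bytes remain entirely up to the application.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the device-to-cloud telemetry topic into buf.
 * Returns false if buf is too small to hold the whole topic. */
bool sketch_telemetry_topic(char* buf, size_t size, const char* device_id)
{
  int n = snprintf(buf, size, "devices/%s/messages/events/", device_id);
  return n >= 0 && (size_t)n < size;
}
```

The resulting topic and an application-defined payload (for example a JSON document) are then handed to the MQTT stack's publish call.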
Note: To limit overhead when publishing, it is recommended to serialize as many MQTT messages as possible within the same TLS record. This feature may not be available on all MQTT/TLS/Sockets stacks.
We recommend that the handling of incoming MQTT PUB messages is implemented by a chain-of-responsibility architecture. Each handler is passed the topic and will either accept and return a response, or pass it to the next handler.
Example:
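A minimal sketch of such a chain, assuming plain prefix matching on the received topic, could look like this. All names (`sketch_result`, `sketch_dispatch`, and the handlers) are hypothetical; a real application would more likely call the SDK's topic-parsing functions inside each handler instead of matching strings directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes: which handler, if any, accepted the topic. */
typedef enum
{
  SKETCH_NOT_HANDLED,
  SKETCH_C2D,
  SKETCH_METHOD,
  SKETCH_TWIN
} sketch_result;

typedef sketch_result (*sketch_handler)(const char* topic);

static bool starts_with(const char* s, const char* prefix)
{
  return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Cloud-to-device messages: devices/{device id}/messages/devicebound/... */
sketch_result sketch_handle_c2d(const char* topic)
{
  bool match = starts_with(topic, "devices/")
      && strstr(topic, "/messages/devicebound/") != NULL;
  return match ? SKETCH_C2D : SKETCH_NOT_HANDLED;
}

/* Direct method requests: $iothub/methods/POST/{name}/?$rid={id} */
sketch_result sketch_handle_method(const char* topic)
{
  return starts_with(topic, "$iothub/methods/POST/") ? SKETCH_METHOD : SKETCH_NOT_HANDLED;
}

/* Twin responses and desired-property patches: $iothub/twin/... */
sketch_result sketch_handle_twin(const char* topic)
{
  return starts_with(topic, "$iothub/twin/") ? SKETCH_TWIN : SKETCH_NOT_HANDLED;
}

/* Walks the chain; the first handler that accepts the topic wins. */
sketch_result sketch_dispatch(const char* topic)
{
  static const sketch_handler chain[]
      = { sketch_handle_c2d, sketch_handle_method, sketch_handle_twin };

  for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++)
  {
    sketch_result r = chain[i](topic);
    if (r != SKETCH_NOT_HANDLED)
    {
      return r;
    }
  }
  return SKETCH_NOT_HANDLED; /* unknown topic: log or ignore */
}
```

Keeping the handlers in an array makes it easy to add or remove a feature (for example twin support) without touching the dispatch loop.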
Important: C2D messages are not enqueued until the device establishes the first MQTT session (connects for the first time to IoT Hub). The C2D message queue is preserved (according to the per-message time-to-live) as long as the device connects with Clean Session false.
Retrying operations requires understanding two aspects: error evaluation (did the operation fail, and should it be retried) and retry timing (how long to delay before retrying the operation). The IoT client library supplies optional APIs for error classification and retry timing.
The SDK will not handle protocol-level (WebSocket, MQTT, TLS or TCP) errors. The application developer is expected to classify and handle errors as follows:
Both IoT Hub and Provisioning services will use MQTT CONNACK as described in Section 3.2.2.3 of the MQTT v3.1.1 specification.
Note: The Provisioning Service query polling operation may result in retriable errors. In some cases, the service response will not include an operation_id. In this case, the device may either reuse a cached operation_id or restart the flow from the register step.
APIs using az_iot_status report service-side errors to the client through the IoT protocols.
The following APIs may be used to determine if the status indicates an error and if the operation should be retried:
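The SDK's own classification functions are not reproduced here. As a rough illustration of the two questions they answer, the sketch below uses hypothetical helpers (`sketch_status_is_error`, `sketch_status_is_retriable`) and assumes that status values follow HTTP-style numbering. Treating throttling (429) and server-side failures (5xx) as retriable is an illustrative policy, not a statement of what the SDK does.

```c
#include <stdbool.h>

/* Hypothetical: assumes HTTP-style status numbering (2xx success, 4xx/5xx error). */
bool sketch_status_is_error(int status)
{
  return status >= 400;
}

/* Hypothetical policy: retry on throttling (429) and on server-side errors (5xx).
 * Other client-side errors (4xx) are assumed to need a corrected request instead. */
bool sketch_status_is_retriable(int status)
{
  return status == 429 || (status >= 500 && status < 600);
}
```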
Network timeouts and the MQTT keep-alive interval should be configured considering tradeoffs between how fast network issues are detected vs traffic overheads. This document describes the recommended keep-alive timeouts as well as the minimum idle timeout supported by Azure IoT services.
For connectivity issues at all layers (TCP, TLS, MQTT) as well as cases where there is no retry-after sent by the service, we suggest using an exponential back-off with random jitter function. az_iot_retry_calc_delay is available in Azure IoT Common:
Note 1: The network stack may have used more time than the recommended delay before timing out (e.g. the operation timed out after 2 minutes while the delay between operations is 1 second). In this case there is no need to delay the next operation.
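To make the back-off calculation and Note 1 concrete, here is an independent sketch of an exponential back-off with jitter. It is not the SDK's `az_iot_retry_calc_delay` implementation; the function name `sketch_retry_delay_msec`, its parameter order, and the doubling rule are assumptions chosen for illustration. The caller supplies the random jitter value so the function stays deterministic.

```c
#include <stdint.h>

/* Hypothetical back-off helper (not the SDK implementation).
 *   operation_msec     time already spent in the failed operation (>= 0)
 *   attempt            1-based attempt number
 *   min_delay_msec     delay used for the first retry
 *   max_delay_msec     upper bound for the delay, jitter included
 *   random_jitter_msec caller-supplied random value (>= 0)
 * Returns how long to wait before the next attempt, in milliseconds. */
int32_t sketch_retry_delay_msec(
    int32_t operation_msec,
    int16_t attempt,
    int32_t min_delay_msec,
    int32_t max_delay_msec,
    int32_t random_jitter_msec)
{
  /* Exponential growth: min_delay * 2^(attempt - 1), saturating at max_delay. */
  int32_t delay = min_delay_msec;
  for (int16_t i = 1; i < attempt && delay < max_delay_msec; i++)
  {
    delay = (delay > max_delay_msec / 2) ? max_delay_msec : delay * 2;
  }
  if (delay > max_delay_msec)
  {
    delay = max_delay_msec;
  }

  /* Add jitter without exceeding the cap (and without overflowing). */
  if (random_jitter_msec > 0)
  {
    delay = (random_jitter_msec > max_delay_msec - delay) ? max_delay_msec
                                                          : delay + random_jitter_msec;
  }

  /* Note 1: time already spent in the operation counts toward the delay. */
  if (operation_msec >= delay)
  {
    return 0;
  }
  return delay - operation_msec;
}
```

With a 1 second minimum and a 100 second maximum, the third attempt waits 4 seconds plus jitter, and an operation that already blocked for 2 minutes is retried immediately.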
Note 2: To determine the parameters of the exponential back-off retry strategy, we recommend modeling the network characteristics (including failure modes). Compare the results with defined SLAs for device connectivity (e.g. 1M devices must be connected in under 30 minutes) and with the available Azure IoT Hub scale and Azure Provisioning Service scale (especially consider throttling, quotas and maximum requests/connects per second).
In the absence of modeling, we recommend the following default:
For service-level errors, the Provisioning Service provides a retry-after (in seconds) parameter:
Combining the functions above we recommend the following flow:
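One way to read this flow is as a single decision step: stop on success, report non-retriable errors, and otherwise wait for either the service-supplied retry-after or the locally computed back-off delay. The sketch below is an interpretation under those assumptions; the types and names (`sketch_action`, `sketch_decision`, `sketch_decide`) are hypothetical, and the classification results and back-off delay are passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of evaluating one operation result. */
typedef enum
{
  SKETCH_DONE,          /* operation succeeded */
  SKETCH_RETRY,         /* wait delay_msec, then retry */
  SKETCH_REPORT_FAILURE /* not retriable: surface to the application */
} sketch_action;

typedef struct
{
  sketch_action action;
  int32_t delay_msec;
} sketch_decision;

/* retry_after_seconds <= 0 means the service sent no retry-after value. */
sketch_decision sketch_decide(
    bool is_error,
    bool is_retriable,
    int32_t retry_after_seconds,
    int32_t backoff_delay_msec)
{
  sketch_decision d = { SKETCH_DONE, 0 };

  if (!is_error)
  {
    return d;
  }
  if (!is_retriable)
  {
    d.action = SKETCH_REPORT_FAILURE;
    return d;
  }

  d.action = SKETCH_RETRY;
  if (retry_after_seconds > 0)
  {
    /* Prefer the service-supplied value; convert to msec without overflow. */
    d.delay_msec = (retry_after_seconds > INT32_MAX / 1000) ? INT32_MAX
                                                            : retry_after_seconds * 1000;
  }
  else
  {
    /* No retry-after: fall back to exponential back-off with jitter. */
    d.delay_msec = backoff_delay_msec;
  }
  return d;
}
```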
When using Provisioning Service, we recommend using a MAX_HUB_RETRY (default 10) to handle cases where the Edge/Stack or IoT Hub changed endpoint information.
When devices are using IoT Hub without Provisioning Service, we recommend attempting to rotate the IoT Credentials (SAS Token or X509 Certificate) on authentication issues.
Note: Authentication issues observed in the following cases do not require credentials to be rotated: