In our first two posts in this series on building an IoT platform using open-source components, we looked at the “Whole Being Greater than the Sum of the Parts” and then at “The Parts that Make the Whole, Whole.” In this third article, Umesh Puranik and I describe an application-level communication architecture for transferring information between the various components of the platform. In terms of the four basic functions of an IoT system (Data Capture, Data Transfer, Data Processing / Analysis, and Action), this is the Data Transfer function, and it operates above the physical layer protocols (both wired and wireless) that we covered in a previous article. Let’s jump right in.
Messaging-Based Architecture
Data generated by the many sensors and devices of an IoT system typically needs to be delivered to the storage and analytics systems using open protocols and communication models such as client-server or peer-to-peer. The client-server model is simpler and more suitable for systems where data from a large number of devices is funneled into a pipeline for further processing. Protocols like MQTT (Message Queuing Telemetry Transport), CoAP (Constrained Application Protocol), AMQP (Advanced Message Queuing Protocol), and XMPP (Extensible Messaging and Presence Protocol) are good candidates for such scenarios. The peer-to-peer model is more complex and is suitable for mission-critical systems or machine-to-machine (M2M) scenarios where many devices need to share data with many other devices under stricter quality of service (QoS) or latency constraints. DDS (Data Distribution Service) is such a protocol.
General-purpose Internet protocols such as HTTP, and the REST-style APIs built on it, are often too heavyweight for the sensor side of IoT platforms in terms of overhead, processing and memory requirements, data distribution models, coupling of systems, etc. (They do, however, play a major role in the communications between the gateway and the storage/analytics back-end.) Recall that the sensor platforms in the IoT are generally resource-constrained and hence need protocols with a small footprint and low overhead that perform reliably in low-bandwidth and high-latency networks. MQTT and CoAP are specifically designed with these requirements in mind. Both use a client-server model, run on IP networks, provide an asynchronous (non-blocking) mode of operation, and support QoS and security.
MQTT is a messaging protocol based on the publisher-subscriber (“pub-sub”) model. It uses TCP as the underlying transport protocol, and an MQTT client maintains a long-lived connection with the server (broker). It supports one-to-many and many-to-many communication through the central message broker. The packet format is simple and compact, which makes MQTT efficient even in low-bandwidth or high-latency networks. MQTT is data-agnostic, meaning that the broker does not need to interpret the messages flowing through it. The number of API calls that need to be supported is very small, simplifying the implementation. It provides three levels of QoS (at most once, at least once, and exactly once), and security is provided by TLS/SSL. MQTT-SN (MQTT for Sensor Networks) is a variant of the protocol that runs on UDP/IP and does not require clients to maintain a connection with the broker. MQTT clients are very simple, while the broker is more complex because of the functionality it carries (message routing, maintaining connections, etc.). Generally, the sensor platforms act as messaging clients.
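To make the "simple and compact" packet format concrete, here is a minimal sketch of how an MQTT PUBLISH packet is laid out on the wire, following the MQTT 3.1.1 packet structure. It is an illustration only, not a client implementation; the function names and the topic "sensors/t1" are our own assumptions.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT 'remaining length' field.

    Each byte carries 7 bits of the value; the high bit signals that
    another length byte follows. At most four bytes are allowed.
    """
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit
        out.append(byte)
        if n == 0:
            break
    return bytes(out)


def encode_publish(topic: str, payload: bytes, qos: int = 0,
                   retain: bool = False, packet_id: int = 1) -> bytes:
    """Build a PUBLISH packet (hypothetical helper, for illustration)."""
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")
    topic_bytes = topic.encode("utf-8")
    # Variable header: length-prefixed topic name, plus a packet
    # identifier only when QoS is 1 or 2.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    if qos > 0:
        variable_header += packet_id.to_bytes(2, "big")
    # Fixed header, byte 1: packet type 3 (PUBLISH) in the high nibble,
    # then the QoS bits and the RETAIN flag (DUP is left at 0 here).
    first_byte = (3 << 4) | (qos << 1) | int(retain)
    body = variable_header + payload
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


packet = encode_publish("sensors/t1", b"21.5")
print(len(packet), packet.hex())
```

For this QoS 0 example the whole packet is 18 bytes: a 10-byte topic, a 4-byte payload, and only 4 bytes of protocol overhead (a 2-byte fixed header and a 2-byte topic length).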
CoAP also uses a client-server model and runs on UDP/IP. It is a document transfer protocol like HTTP, but is much more lightweight because of its smaller and simpler packets. The number of API calls that need to be supported is very small. CoAP is interoperable with HTTP and REST, and it supports content negotiation between client and server as well as device discovery. (MQTT lacks these features.) In contrast to MQTT, which inherits reliable, ordered delivery from TCP, CoAP runs on UDP, so retransmission and reordering of datagrams must be handled above the transport, by CoAP's confirmable messages and by the applications that use it. As a result, CoAP needs more processing power than MQTT if used on a sensor platform, although it does provide more features and direct interoperability with the web.
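As an illustration of how small CoAP packets are, the sketch below packs and unpacks the fixed 4-byte CoAP header (version, message type, token length, code, and message ID) as laid out in the CoAP specification. The helper names are our own, and options and payload are deliberately left out.

```python
import struct

COAP_VERSION = 1
# Message types: confirmable messages are retransmitted until
# acknowledged; non-confirmable messages are fire-and-forget.
CON, NON, ACK, RST = 0, 1, 2, 3
GET = 0x01  # request code 0.01


def encode_coap_header(msg_type: int, code: int, message_id: int,
                       token: bytes = b"") -> bytes:
    """Pack the 4-byte CoAP header followed by the token (illustrative)."""
    if len(token) > 8:
        raise ValueError("token may be at most 8 bytes")
    # Byte 0: 2-bit version, 2-bit type, 4-bit token length.
    first = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first, code, message_id) + token


def decode_coap_header(data: bytes) -> dict:
    """Unpack the header fields from the start of a CoAP message."""
    first, code, message_id = struct.unpack("!BBH", data[:4])
    token_length = first & 0x0F
    return {
        "version": first >> 6,
        "type": (first >> 4) & 0x03,
        "code": code,
        "message_id": message_id,
        "token": data[4:4 + token_length],
    }


request = encode_coap_header(CON, GET, 0x1234, token=b"\xab")
print(request.hex(), decode_coap_header(request))
```

The message type is where the reliability trade-off shows up: a sender that needs delivery confirmation marks the message confirmable and retransmits until it sees an acknowledgement with the same message ID, while a sender that can tolerate loss uses a non-confirmable message.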
To choose the most appropriate protocol for a given scenario, one needs to consider the trade-offs between complexity, processing power, and interoperability. We will use MQTT for a general-purpose IoT system in which data from many producers (e.g., devices, sensors) is distributed to many consumers (e.g., storage systems, analytics platforms). The pub-sub model decouples producers from consumers and supports both one-to-many and many-to-many data distribution. MQTT has strong open-source support: brokers (e.g., Mosquitto), clients (e.g., Paho), and client tools are all available in a wide variety of programming languages.
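The decoupling that pub-sub provides can be sketched with a toy in-memory broker. This is not how Mosquitto is implemented; it only shows the idea that publishers address topics, not consumers, and that the broker fans messages out using MQTT-style topic filters ("+" matches one level, "#" matches all remaining levels). The class, function, and topic names are hypothetical, and details such as topics beginning with "$" are ignored.

```python
from typing import Callable, List, Tuple

Callback = Callable[[str, bytes], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style topic filter matches a topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """Routes each published message to every matching subscriber."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Callback]] = []

    def subscribe(self, topic_filter: str, callback: Callback) -> None:
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver the message and return the number of deliveries."""
        delivered = 0
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
                delivered += 1
        return delivered


broker = ToyBroker()
storage, analytics = [], []
broker.subscribe("sensors/#", lambda t, p: storage.append((t, p)))
broker.subscribe("sensors/+/temperature", lambda t, p: analytics.append((t, p)))

# The sensor publishes without knowing who, if anyone, is listening.
broker.publish("sensors/room1/temperature", b"21.5")
broker.publish("sensors/room1/humidity", b"40")
print(len(storage), len(analytics))
```

Adding a new consumer here means adding one more subscription; nothing on the producer side changes, which is the property that matters when many devices feed many back-end systems.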
In the above diagram, the sensor platform is a messaging client and acts as both a publisher (for sending data) and a subscriber (for receiving actuator control commands). Depending on the actual sensor platform to be used (e.g., Arduino, Intel Edison, or a PC), an appropriate open-source client implementation can be used. The storage systems (databases) and the analytics engine can be interfaced with custom or open-source MQTT client implementations for receiving data from the sensor platforms and sending actions (e.g., alarms, alerts) back to them. The sensor platforms, storage systems, and analytics engine can all be configured as subscribers for device discovery, configuration, maintenance, etc. While MQTT does not directly support these functions, they can easily be implemented at the application level.
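One possible way to implement device discovery at the application level is a topic convention plus a small announcement message. The sketch below is only an assumption about how this could look: the topic layout ("devices/<id>/announce", "devices/<id>/commands"), the JSON fields, and the class and function names are all hypothetical, and the transport is left out so that only the message handling is shown.

```python
import json
from typing import Dict, List, Tuple


def announce_topic(device_id: str) -> str:
    """Topic on which a device announces itself (hypothetical layout)."""
    return f"devices/{device_id}/announce"


def command_topic(device_id: str) -> str:
    """Topic a device subscribes to for actuator commands (hypothetical)."""
    return f"devices/{device_id}/commands"


def build_announcement(device_id: str, sensors: List[str],
                       actuators: List[str]) -> Tuple[str, bytes]:
    """Return the (topic, payload) a device would publish on start-up."""
    payload = json.dumps(
        {"id": device_id, "sensors": sensors, "actuators": actuators},
        sort_keys=True,
    ).encode("utf-8")
    return announce_topic(device_id), payload


class DeviceRegistry:
    """Back-end subscriber that tracks which devices have announced."""

    def __init__(self) -> None:
        self.devices: Dict[str, dict] = {}

    def on_message(self, topic: str, payload: bytes) -> None:
        parts = topic.split("/")
        # Ignore anything that is not devices/<id>/announce.
        if len(parts) != 3 or parts[0] != "devices" or parts[2] != "announce":
            return
        info = json.loads(payload.decode("utf-8"))
        # Ignore announcements whose body disagrees with the topic.
        if info.get("id") != parts[1]:
            return
        self.devices[parts[1]] = info


registry = DeviceRegistry()
topic, payload = build_announcement("edison-01", ["temperature"], ["fan"])
registry.on_message(topic, payload)
print(sorted(registry.devices), command_topic("edison-01"))
```

In a real deployment the registry's on_message would be wired to an MQTT client subscribed to "devices/+/announce". Publishing the announcement as a retained message is one option for letting subscribers that connect later still learn about the device.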
Having selected a messaging architecture for the data transfer between IoT platform components, we will look at the data processing and analytics functions in the next article in this series.
Image Credits: Kansas City Symphony
Dr. Siddhartha Chatterjee is Chief Technology Officer at Persistent Systems. Umesh Puranik is a Principal Architect at Persistent Systems. We thank Sachin Kurlekar for his insightful comments.