(This is Part 2 of a three-part series. In Part 1, we looked at the common building blocks of an IoT platform at a high level. In this part, we look at the details of each block and the technology choices for implementing them.)
An IoT platform consists of the following significant physical and logical components/layers.
Interfaces for Devices and Operator Administrator – The platform provides interfaces for devices and for operator administration. Operators use REST/HTTP interfaces to provision devices. Devices need multiple IoT protocol interfaces (such as MQTT or CoAP) in addition to REST/HTTP. This layer is implemented using a REST/web services engine, with protocol adapters or bridges created to translate to and from the other protocols. The operator interface allows management of products and devices, locations/sites, and users. It also allows creation of rules or policies for analyzing the payloads sent by devices. A rule ties a product (device family or group), device location, data attribute(s), and matching conditions/values together to an action that needs to be taken.
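To make the shape of a rule concrete, here is a minimal sketch in Python. The `Rule` class, its field names, the `OPERATORS` table, and the message layout (`product`, `location`, `payload`) are all assumptions made for illustration; a real platform would define its own schema.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Comparison operators a rule condition may use (illustrative set only).
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "lt": operator.lt,
    "gt": operator.gt,
    "eq": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: product + location + attribute + condition -> action."""
    product: str    # device family or group
    location: str   # site where the device is installed
    attribute: str  # payload attribute to inspect
    op: str         # key into OPERATORS
    value: Any      # value to compare the attribute against
    action: str     # name of the action handler to invoke on a match

    def matches(self, message: Dict[str, Any]) -> bool:
        """Return True only if the message satisfies every part of the rule."""
        if message.get("product") != self.product:
            return False
        if message.get("location") != self.location:
            return False
        payload = message.get("payload", {})
        if self.attribute not in payload:
            return False
        return OPERATORS[self.op](payload[self.attribute], self.value)


freeze_alert = Rule("thermostat", "warehouse-1", "temperature_f", "lt", 0, "send_sms")
cold_message = {
    "product": "thermostat",
    "location": "warehouse-1",
    "payload": {"temperature_f": -4},
}
warm_message = {
    "product": "thermostat",
    "location": "warehouse-1",
    "payload": {"temperature_f": 68},
}
```

Here `freeze_alert.matches(cold_message)` is true and `freeze_alert.matches(warm_message)` is false; the `action` field is just a name that an action handler registry would resolve.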
Messaging Broker – Device-facing interfaces hand off incoming messages to this layer. It is usually implemented with a message broker such as RabbitMQ or ActiveMQ, or with a messaging library such as ZeroMQ. Sometimes there is a chain of brokers, for example RabbitMQ at the front passing messages to Apache Kafka for scalability. A federated RabbitMQ configuration can also be used to achieve the desired scaling.
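The hand-off itself can be sketched with an in-process queue. This is only a stand-in: Python's standard `queue.Queue` plays the role of the broker, and the function names `handle_device_post` and `drain` as well as the envelope fields are hypothetical. A real deployment would publish through the client library of the chosen broker.

```python
import json
import queue

# In-process stand-in for a broker queue (assumption for illustration).
# A real platform would publish to RabbitMQ, ActiveMQ, Kafka, etc.
incoming = queue.Queue(maxsize=1000)


def handle_device_post(device_id, payload):
    """Device-facing interface: wrap the payload and hand it off."""
    envelope = {"device_id": device_id, "payload": payload}
    # put_nowait raises queue.Full instead of blocking the device request.
    incoming.put_nowait(json.dumps(envelope))


def drain(batch_size=100):
    """Downstream consumer: pull up to batch_size messages off the queue."""
    messages = []
    while len(messages) < batch_size:
        try:
            messages.append(json.loads(incoming.get_nowait()))
        except queue.Empty:
            break
    return messages


handle_device_post("dev-1", {"temperature_f": 71.5})
handle_device_post("dev-2", {"humidity": 40})
received = drain()
```

The point of the design is decoupling: the device-facing interface returns as soon as the message is queued, and storage or analytics consumers read at their own pace.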
Storage Layer – Messages are optionally transformed and then stored in persistent storage. NoSQL databases (MongoDB, Cassandra, and HBase) fit this requirement very well, as they provide the flexibility required to support different message formats and can support web-scale deployments. SQL databases with partitioning and clustering capabilities are a good choice too. Using “search-capable” NoSQL options like MongoDB can be quite handy here.
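As a rough illustration of storing messages with differing formats, the sketch below keeps each payload as a JSON document next to a few indexed columns. It uses Python's built-in `sqlite3` with an in-memory database purely so that it runs anywhere; the table name, column names, and helper functions are assumptions, and a production system would use one of the databases named above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# device_id and received_at (epoch seconds) are indexed for lookups;
# body holds the raw payload as JSON text, whatever its shape.
conn.execute(
    """
    CREATE TABLE messages (
        device_id   TEXT    NOT NULL,
        received_at INTEGER NOT NULL,
        body        TEXT    NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_device_time ON messages (device_id, received_at)")


def store(device_id, received_at, payload):
    """Persist one message; the payload may have any JSON-serializable shape."""
    conn.execute(
        "INSERT INTO messages (device_id, received_at, body) VALUES (?, ?, ?)",
        (device_id, received_at, json.dumps(payload)),
    )


def latest(device_id):
    """Return the most recent payload for a device, or None if there is none."""
    row = conn.execute(
        "SELECT body FROM messages WHERE device_id = ? "
        "ORDER BY received_at DESC LIMIT 1",
        (device_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


# Two messages from the same device with different attribute sets.
store("dev-1", 1000, {"temperature_f": 70.2})
store("dev-1", 2000, {"humidity": 41, "battery": "low"})
```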
Analytics Layer – There are two types of analytical processing:
Streaming analytics – This is the “Hot” path. It checks for matching rules as soon as a message is received and hands off to the appropriate action handlers if the rule conditions are met. Popular choices here are stream-processing tools like Apache Spark and Apache Storm. Example: if the temperature drops below 0°F, send a text message to a certain number. Rules can be statically defined. They can also be defined dynamically using machine learning algorithms based on historical or training data.
Batch or on-demand analytics – This is the “Cold” path. Analytics processing happens over stored data, either on a schedule (batch jobs) or on demand (ad hoc reports). The goal is to generate a summarized or aggregated view of the data that the reporting layer can consume. Apache Spark and Hadoop technologies can be used for this layer. Another choice is ETL and OLAP tools. Note that Spark could have an advantage here, as it can be used for both streaming and batch processing.
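The difference between the two paths can be sketched in plain Python. This is not how Spark or Storm would express it; it only contrasts per-message evaluation with aggregation over stored data. The function names, the message fields (`device_id`, `ts`, `temperature_f`), and the hourly bucketing are assumptions chosen for illustration, and `notify` stands in for a real text-message handler.

```python
from collections import defaultdict
from statistics import mean


def hot_path(message, notify):
    """Hot path: evaluate the example rule as soon as a message arrives."""
    temp = message.get("temperature_f")
    if temp is not None and temp < 0:
        notify(f"Device {message['device_id']} reported {temp}°F")
        return True
    return False


def cold_path(stored_messages):
    """Cold path: batch job computing average temperature per device per hour."""
    buckets = defaultdict(list)
    for m in stored_messages:
        if "temperature_f" in m:
            hour = m["ts"] // 3600  # epoch seconds -> hour bucket
            buckets[(m["device_id"], hour)].append(m["temperature_f"])
    return {key: mean(values) for key, values in buckets.items()}


sent = []  # collects notifications instead of sending real text messages
readings = [
    {"device_id": "dev-1", "ts": 0, "temperature_f": 10.0},
    {"device_id": "dev-1", "ts": 1800, "temperature_f": 20.0},
    {"device_id": "dev-1", "ts": 3600, "temperature_f": -5.0},
]
for reading in readings:
    hot_path(reading, sent.append)

summary = cold_path(readings)
```

With these readings, only the third message triggers a notification, while `summary` holds one average per device and hour for the reporting layer to consume.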
Data Consumer APIs – This layer exposes the persisted data through consumer-application-friendly APIs in REST/HTTP format. If the Storage layer is implemented in MongoDB, then this layer acts as a wrapper over MongoDB’s CRUD APIs. Some implementations also use search infrastructure like Elasticsearch in this layer, kept frequently synchronized with the Storage layer. This helps a lot, as many consumer application requests can be routed directly to Elasticsearch. This layer also implements Role-Based Access Control.
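A minimal sketch of role-based access control in front of stored data might look like the following. The roles, the permission names, the `DeviceDataAPI` class, and the dictionary used as a backing store are all hypothetical; in practice the checks would sit in the REST layer and the store would be the database or search index.

```python
# Hypothetical role -> permission mapping.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "operator": {"read", "write"},
    "viewer": {"read"},
}


class Forbidden(Exception):
    """Raised when a role lacks the permission an operation requires."""


class DeviceDataAPI:
    """Thin wrapper that checks the caller's role before touching storage."""

    def __init__(self, store):
        self._store = store  # dict: device_id -> list of readings

    def _check(self, role, permission):
        if permission not in ROLE_PERMISSIONS.get(role, set()):
            raise Forbidden(f"role {role!r} lacks {permission!r} permission")

    def get_readings(self, role, device_id):
        self._check(role, "read")
        return list(self._store.get(device_id, []))

    def add_reading(self, role, device_id, reading):
        self._check(role, "write")
        self._store.setdefault(device_id, []).append(reading)

    def delete_readings(self, role, device_id):
        self._check(role, "delete")
        return self._store.pop(device_id, None) is not None


api = DeviceDataAPI({"dev-1": [{"temperature_f": 70.2}]})
viewer_result = api.get_readings("viewer", "dev-1")
```

A call such as `api.delete_readings("viewer", "dev-1")` raises `Forbidden`, while the same call with the `admin` role succeeds.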
Reporting and Dashboards Layer – This layer allows operators to create various reports (canned, ad hoc, etc.) and to control their scheduling, delivery, and formats. Multiple open source or commercial reporting libraries can be used here.
In the next part, we will list the key differentiating features of an IoT platform that indicate its maturity or customizability.