By Constanze Schmitz and Valentijn de Leeuw
When the Internet started it had 11 nodes; today this number is estimated to exceed 15 billion. Consumers access the Internet through mobile phones, PCs, tablets and TVs. Industrial companies connect ever more equipment, ranging from computers and machines to field devices and sensors. As a result, the volume of collected data has grown exponentially over recent years and is expected to grow further, driven by developments such as the increased use of sensors sparked by the Industrial Internet of Things (IIoT) and Industrie 4.0. Until the end of the 1990s data flows were predominantly outbound, while these days data flows in increasing quantities in both directions. While storage and computing power have become relatively cheap, bandwidth can be a limiting factor.
For decades, historical industrial data was kept in time-series databases, while other types of data went into relational databases such as Oracle or Microsoft SQL Server. In a relational database, data is structured and added to linked, pre-defined tables, and scalability is vertical, meaning more data demands a bigger server. While this technology has served businesses well for decades, today’s world of Industrial IoT and Big Data, with its great variety, velocity and volume, requires a different approach, especially when analysis needs to be done fast and the latency to operational decisions must be reduced.
MongoDB is a NoSQL (“Not only SQL”) database that scales horizontally across multiple servers, which supports easy growth and secure back-ups. It stores data as BSON documents with dynamic schemas, which makes integrating data easier and faster, and it offers flexible index support and rich queries, including full-text search. Replication provides high availability, and the database offers advanced security features. Map-Reduce can be used for batch processing of data and for aggregation operations.
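To make the ideas of dynamic schemas and Map-Reduce more concrete, here is a minimal sketch in plain Python. It does not use MongoDB or its drivers; the sensor names, field names and the helper functions `map_reading`, `reduce_values` and `map_reduce` are hypothetical and chosen for illustration only. It shows documents with differing fields living in one collection, and a map step followed by a reduce step that averages the readings per sensor.

```python
from collections import defaultdict

# Hypothetical sensor readings. As with a dynamic schema, the documents
# in one collection do not have to share the same set of fields.
readings = [
    {"sensor": "pump-1", "type": "temperature", "value": 71.5, "unit": "C"},
    {"sensor": "pump-1", "type": "temperature", "value": 72.5, "unit": "C"},
    {"sensor": "valve-7", "type": "position", "value": 40.0, "open": True},
    {"sensor": "pump-2", "type": "temperature", "value": 65.0},
]


def map_reading(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["sensor"], doc["value"]


def reduce_values(key, values):
    """Reduce step: collapse all values emitted for one key into a mean."""
    return sum(values) / len(values)


def map_reduce(docs, mapper, reducer):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


averages = map_reduce(readings, map_reading, reduce_values)
print(averages)
```

In a real deployment the grouping would run inside the database rather than in the client; the sketch only illustrates the shape of the computation.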
MongoDB scales horizontally using sharding. A shard is a node of the cluster that holds a subset of the data and is typically deployed with at least two replicas, one master (primary) and one or more slaves (secondaries). MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure or when maintenance is needed.
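The way duplicated data keeps a shard available can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not MongoDB's actual mechanism: real replica sets replicate asynchronously and choose a new primary through an election, whereas this hypothetical `ToyReplicaSet` class copies writes synchronously and simply promotes the first surviving member by name.

```python
class ToyReplicaSet:
    """Toy model of one shard: a primary plus secondaries holding copies."""

    def __init__(self, members):
        self.data = {name: {} for name in members}  # one copy per member
        self.alive = set(members)
        self.primary = members[0]

    def write(self, key, value):
        # Simplification: the write is copied to every live member at once.
        for name in self.alive:
            self.data[name][key] = value

    def fail(self, name):
        # Simulate a hardware failure; promote a survivor if the primary died.
        self.alive.discard(name)
        if name == self.primary:
            if not self.alive:
                raise RuntimeError("no members left in the replica set")
            self.primary = sorted(self.alive)[0]

    def read(self, key):
        # Reads are served by whichever member is currently primary.
        return self.data[self.primary][key]


# Hypothetical node names and measurement key.
rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write("tank-3/level", 0.82)
rs.fail("node-a")  # the original primary goes down
print(rs.primary, rs.read("tank-3/level"))
```

Because every member holds a copy, the value written before the failure is still readable from the newly promoted primary.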
The user chooses a shard key, which determines how the data in a collection is distributed. The data is split into ranges of shard key values, and the ranges are distributed across multiple shards. Automatic configuration makes deployment easy, and new machines can be added to a running database. The database can automatically balance nodes deployed in different geographic regions according to local usage, and replicate or migrate data to where it is needed.
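Range-based distribution by shard key can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than MongoDB's implementation: the shard key (`plant`), the split points, the shard names and the `shard_for` helper are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical split points on the shard key. They define four ranges:
# below "g", from "g" up to "n", from "n" up to "t", and "t" or above.
split_points = ["g", "n", "t"]

# Hypothetical assignment of each range to a shard. One shard may own
# several ranges, which is what lets a balancer move ranges around.
range_to_shard = ["shard-0", "shard-1", "shard-2", "shard-0"]


def shard_for(shard_key_value):
    """Find the range containing the key, then the shard owning that range."""
    index = bisect_right(split_points, shard_key_value)
    return range_to_shard[index]


# Hypothetical documents, keyed by plant name.
docs = [
    {"plant": "antwerp", "tag": "FIC-101", "value": 12.4},
    {"plant": "houston", "tag": "TI-220", "value": 88.1},
    {"plant": "rotterdam", "tag": "PI-310", "value": 3.2},
    {"plant": "ulsan", "tag": "LI-405", "value": 0.67},
]

placement = defaultdict(list)
for doc in docs:
    placement[shard_for(doc["plant"])].append(doc["plant"])

print(dict(placement))
```

Adding a machine in this picture amounts to reassigning some entries of `range_to_shard` to the new shard and moving the corresponding documents, without changing how keys map to ranges.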
At last month’s event, “Internet of Things Cologne 2015”, Timo Klingenmeier of Inmation and Michael Saucier of Transpara pointed out the need to unify industrial data in a single database and to create visualizations in a standardized way for each application. system:inmation is based on a MongoDB database and connects with SCADA systems, controllers, and field devices through OPC-UA. Transpara adds high-performance visualization for any type of fixed or mobile visualization device.
With the IT/OT convergence this brings about, the presenters suggested that merging IT and OT departments is necessary. They also believe that other silos (application silos), boundaries (of self-sufficient systems) and limitations (vendor lock-in) should be phased out in favor of fast and transparent data, unlimited storage, and a well-informed workforce that can make operational decisions. To become agile, companies need to be transparent and use fast loops of observation, orientation, decision and action.
Valentijn de Leeuw, Vice President at ARC Advisory Group, commented: “The combined strengths of MongoDB, inmation and Transpara result in an amazingly agile, scalable and easy-to-use operational data archive and easy-to-use operational intelligence tool. As an add-on cloud-based platform for visualization and analytics, it can offer complementary added value on top of other historian and visualization applications.”