Gantner Instruments GmbH, Montafonerstr. 4, A-6780 Schruns, Austria
How to transform the data avalanche into insight

In a world of increasingly complex products and faster release cycles, the ability to accumulate and efficiently analyze test data has never been more important.

Test labs face two conflicting trends: structures and systems are growing more complex while development times are being cut drastically. As a result, labs are under immense pressure to deliver test results faster and at lower cost, even as they acquire more data from more sensors. Test engineers are continuously looking for ways to reduce test time and risk. To work faster and more efficiently, these engineers must be able to monitor and respond to test data in real time, regardless of the data volume.

Depending on the type of test, its duration, and the measurement frequency, an overwhelming avalanche of data is generated. The challenge ahead is not only to acquire the data, but to store and preserve large volumes of data, and have the ability to access this data for fast, continuous online analysis. Large volumes of both structured and unstructured data require increased processing power, storage, and a reliable data infrastructure. When these elements are combined in a scalable data backend, the result can greatly improve time to market, reduce costs, and lead to better products.

Adaptive and scalable data backend

An adaptive and scalable data backend provides a storage and compute platform for acquiring data streams from instruments, storing configurations, and performing analyses.

To cope with constantly changing requirements, setup configurations, parameter extensions, and varying sample rates, a separation between hot and cold data is the best choice. Raw data that is accessed less frequently and needed only for auditing or test post-processing (‘cold data’) is stored in a distributed streaming platform that scales extremely efficiently. When hundreds of thousands of samples per second from hundreds of channels must be stored and processed, and new variables calculated from them at the same time, this distributed streaming architecture shows its strength.

So-called ‘hot data’, measurement data that must be accessed immediately for analysis, is held in a NoSQL time series database. This database stores data securely in redundant, fault-tolerant clusters. All measurement data is automatically backed up. Flexible data aggregation ensures that measurement data is continuously processed from the streaming platform into the database as predefined datasets, which makes it easy to work with test metrics and KPIs such as mean value, standard deviation, and minimum/maximum. The same data can also be replayed and aggregated differently when a detailed analysis around a certain test event is required. This approach minimizes the investment and operational cost for IT and storage infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks.
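The aggregation-and-replay idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregate`, the in-memory list `raw_log` (standing in for the streaming platform), and the window lengths are all hypothetical and are not taken from Gantner's implementation.

```python
from statistics import mean, pstdev

def aggregate(samples, window_s):
    """Group (timestamp_s, value) samples into fixed windows and summarize each."""
    buckets = {}
    for t, v in samples:
        start = int(t // window_s) * window_s  # start of the window containing t
        buckets.setdefault(start, []).append(v)
    return [
        {
            "window_start": start,
            "mean": mean(vals),
            "std": pstdev(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for start, vals in sorted(buckets.items())
    ]

# Hypothetical raw log: 8 samples at 4 samples/second (stand-in for the cold store)
raw_log = [(i * 0.25, float(i)) for i in range(8)]

coarse = aggregate(raw_log, window_s=2.0)  # routine KPIs: one summary row
fine = aggregate(raw_log, window_s=1.0)    # replay of the same log at finer resolution
```

Because the raw log is kept unchanged, the second call simply re-reads it with a different window, which mirrors how the same data can be aggregated differently around a test event.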

Aircraft engine testing is a typical use case where a scalable data backend offers major advantages. Engine testing generates a lot of data, especially when engine transient responses must be recorded. Data rates can vary from 10 samples/second up to 100,000 samples/second. The challenge is to store massive amounts of sensor data, keep it available on a 24/7 basis, and allow rapid data analysis. Another example where a scalable data backend proves its advantages is fatigue testing of large components or full-scale structures. A typical fatigue test program is divided into a number of flight blocks. At the end of each flight block the test is stopped and the test specimen is inspected for cracks. These manual inspections are time consuming, and the interval between them is relatively large. Structural abnormalities may therefore be detected too late, which may result in retrofitting in-service aircraft.

Condition-based inspection of the test specimen, instead of interval-based inspection, is a potential solution to reduce the total fatigue test duration and to detect abnormalities quickly. One of the implications is that more sensors are required to monitor the behavior of the test specimen and to detect or predict structural failures. As a full-scale fatigue test can generate data at rates of up to 10 MB/s, totalling hundreds of terabytes at completion, data processing and analysis have become a major bottleneck.
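To put the quoted peak rate into perspective, the following back-of-the-envelope calculation shows how quickly data accumulates at 10 MB/s. It assumes decimal units (1 TB = 10^6 MB) and continuous acquisition at the peak rate, which a real test will not sustain throughout; the 100 TB target is an arbitrary illustrative figure.

```python
RATE_MB_PER_S = 10               # peak rate quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_TB = 1_000_000            # assumption: decimal units, 1 TB = 10^6 MB

tb_per_day = RATE_MB_PER_S * SECONDS_PER_DAY / MB_PER_TB
days_for_100_tb = 100 / tb_per_day

print(f"{tb_per_day:.3f} TB per day at peak rate")
print(f"{days_for_100_tb:.0f} days of continuous acquisition to reach 100 TB")
```

At the peak rate this comes to 0.864 TB per day, so a volume of 100 TB corresponds to roughly 116 days of uninterrupted acquisition.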

In order to capture, analyse, and store this enormous volume of data, and to ensure it is available for applications, Gantner turned to a combination of Apache Kafka (data streaming) and CrateDB (a distributed NoSQL database built for IoT/industrial use cases). CrateDB is used for real-time hot storage and Kafka for cost-effective, document-based storage.

“After extensive research and comparisons, we decided to use the combination of Apache Kafka and CrateDB for the design of the data backend,” explains Jürgen Sutterlüti, head of cloud and data analytics at Gantner Instruments. “They are virtually the engine around which the entire concept is built.”

New stream processing platform

Apache Kafka is a messaging system that queues the data received from instruments and makes it highly available to follow-on systems. Kafka looks at every single received value and analyses it for current measures, an approach known as “stream processing”. However, capturing and storing these enormous volumes of data and making them accessible to follow-on systems requires a database that offers the appropriate performance, with interfaces that allow fast and convenient access.

CrateDB is a new kind of distributed SQL database that improves the handling of time series analysis. The use of SQL as a query language simplifies application and integration, and its NoSQL base technology allows IoT data to be processed in a variety of formats. CrateDB can hold hundreds of terabytes of data and, thanks to the shared-nothing architecture within server clusters, guarantees real-time availability without data loss or downtime.
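To illustrate why SQL as a query language keeps time series analysis simple, here is a minimal sketch of a per-second summary query. It deliberately uses Python's built-in `sqlite3` module as a stand-in so that it runs without a database server; CrateDB's SQL dialect and client library differ, and the table name `measurements`, its columns, and the channel name `strain_01` are hypothetical.

```python
import sqlite3

# In-memory stand-in for a SQL time series store (not CrateDB itself)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (ts REAL, channel TEXT, value REAL)")

# Hypothetical samples: one channel at 2 samples/second
rows = [(i * 0.5, "strain_01", 100.0 + i) for i in range(6)]
conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", rows)

# Summarize each one-second bucket with standard SQL aggregates
query = """
    SELECT CAST(ts AS INTEGER) AS bucket_s,
           AVG(value) AS mean_value,
           MIN(value) AS min_value,
           MAX(value) AS max_value
    FROM measurements
    WHERE channel = ?
    GROUP BY bucket_s
    ORDER BY bucket_s
"""
summary = conn.execute(query, ("strain_01",)).fetchall()

for bucket_s, mean_value, min_value, max_value in summary:
    print(bucket_s, mean_value, min_value, max_value)
```

The point of the sketch is the shape of the query: filtering by channel, bucketing by time, and aggregating are all expressed in plain SQL that most engineers and reporting tools already understand.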

“CrateDB is extremely fast and highly scalable,” says Sutterlüti. “That’s why we use the database to store and access all aggregated measurement data. Working with CrateDB demonstrates the potential that can come from combining edge computing, big data handling, and machine learning.”

More and more test labs use specialised control, monitoring, and data acquisition systems. Examples in the field have shown that the lack of integration between these systems still leads to late detection of structural or system anomalies. One of the reasons is that multi-source data and/or metadata is not readily available during the test. Measurement data can therefore not be fully analyzed until the end of the test run (or at predefined intervals).

The Kafka stream processing engine comes with an extensive set of APIs to integrate third-party data streams, for example from a control system. The primary measurement data can be enriched with control and simulation data. An open software architecture that supports a variety of publish-subscribe-based protocols (like MQTT and DDS) allows seamless integration with other monitoring, analysis, and reporting tools. For test labs that currently maintain automated test systems, the application programming interface (API) provides a simple way to integrate existing environments. In turn, they’re able to leverage the automatic recording, data storage, plotting, and configuration management capabilities. Users may also programmatically access data that’s stored in the data backend for use with custom graphics, analytics, and report generation applications.
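The enrichment of measurement data with control data over a publish-subscribe interface can be sketched as follows. This is a toy in-process broker written purely for illustration; it is not Kafka, MQTT, or DDS, and the class `Broker`, the topic names, and the message fields are all hypothetical.

```python
from collections import defaultdict

class Broker:
    """Tiny in-process publish-subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every callback registered for this topic
        for callback in self._subscribers[topic]:
            callback(message)

broker = Broker()
latest_control = {}  # most recent control-system state
enriched = []        # measurement records joined with control data

def on_control(msg):
    latest_control.update(msg)

def on_measurement(msg):
    # Attach the latest known control state to each measurement sample
    enriched.append({**msg, **latest_control})

broker.subscribe("control/setpoint", on_control)
broker.subscribe("daq/strain_01", on_measurement)

broker.publish("control/setpoint", {"load_step": 3, "setpoint_kN": 50.0})
broker.publish("daq/strain_01", {"t": 12.5, "strain": 812.0})

print(enriched)
```

Neither producer knows about the other; each only publishes to a topic. That decoupling is what lets a control system, a simulation, or a reporting tool be added later without changing the acquisition side.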

Gantner’s philosophy is to provide customers with open interfaces where they can store data and send that data wherever they want. Sutterlüti says, “Our customers value this option quite a bit because it relieves them from handling data and how to store it. Instead, they have easy access to APIs, and all the existing functions they are used to are still available.”

Rapid analysis

Analysis or failure detection differs by application. Common software and user interfaces for managing, visualizing, reporting, or defining dedicated event rules simplify access and integration. This approach minimizes the investment cost for IT and storage infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks. For example, to perform temporal and spatial analysis of aircraft engines, or to better understand the mechanical response of an aircraft structural component, powerful querying capabilities enable engineers to analyze large amounts of sensor data on the fly.

Trend monitoring over the entire life of the test will quickly signal any significant change in test article response between repetitive test conditions. With an adaptive and scalable data backend solution, Gantner Instruments is able to provide a platform that allows test labs to grasp, monitor, analyze, and react to any physical data in real time, regardless of the data volume, transforming data into insight.
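One simple way to picture trend monitoring between repetitive test conditions is to compare each repetition against a baseline taken from the first few. The sketch below is an assumption-laden illustration, not the method used in the platform: the function `flag_changes`, the 5 % relative tolerance, the baseline of three repetitions, and the sample values are all hypothetical.

```python
from statistics import mean

def flag_changes(responses, baseline_count=3, tolerance=0.05):
    """Return indices of repetitions whose response deviates from the
    baseline mean by more than `tolerance` (relative deviation)."""
    baseline = mean(responses[:baseline_count])
    return [
        i
        for i, r in enumerate(responses)
        if i >= baseline_count and abs(r - baseline) / abs(baseline) > tolerance
    ]

# Hypothetical peak strain recorded at the same load condition in six repetitions
peak_strain = [800.0, 802.0, 798.0, 801.0, 805.0, 860.0]
flagged = flag_changes(peak_strain)

print(flagged)
```

Here only the last repetition deviates from the baseline by more than the tolerance, so it is the one that would prompt a closer look at the test article.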

Experience the new platform for modern and robust measurement setups

The GI.bench software platform combines faster test setups, project configuration and handling, as well as visualization of data streams in one digital workbench. It enables you to configure, execute, and analyze your measurement and test tasks on the fly. Access live and historical measurement data anywhere.

GI.bench - monitor better & test faster
