Tips & Trends | 6 minutes Reading Time | 2019-11-17

How to transform the data avalanche into insight

In a world of increasingly complex products and faster release cycles, the ability to accumulate and efficiently analyze test data has never been more important.

The conflicting trends of increasingly complex structures and systems and drastically shortened development times put test labs under immense pressure to deliver test results faster and at lower cost, even though they acquire more data from more sensors. Test engineers are continuously looking for ways to reduce test time and risk. To work faster and more efficiently, these engineers must be able to monitor and respond to test data in real time, regardless of the data volume.

Depending on the type of test, its duration, and the measurement frequency, an overwhelming avalanche of data can be generated. The challenge ahead is not only to acquire the data, but to store and preserve large volumes of data, and to be able to access this data for fast, continuous online analysis. Large volumes of both structured and unstructured data require increased processing power, storage, and a reliable data infrastructure. Combined into a scalable data backend, these elements can greatly improve time to market, reduce costs, and help build better products.

Adaptive and scalable data backend

An adaptive and scalable data backend provides a storage and compute platform for acquiring data streams from instruments, storing configurations, and performing analyses.

To cope with constantly changing requirements, setup configurations, parameter extensions, and varying sample rates, a separation between hot and cold data is the best choice. Raw data that is accessed less frequently and only needed for auditing or test post-processing (‘cold data’) is stored in a distributed streaming platform that scales extremely efficiently. When new variables have to be stored, processed, and calculated from hundreds of thousands of samples per second across hundreds of channels at the same time, this distributed streaming architecture shows its strength.

So-called ‘hot data’, measurement data that must be accessed immediately for analysis, is provided in a NoSQL time series database. This database stores data securely in redundant, fault-tolerant clusters. All measurement data is automatically backed up. Flexible data aggregation ensures that measurement data is continuously processed from the streaming platform into the database as predefined datasets, which simplifies the processing of test metrics and KPIs such as mean value, standard deviation, and minimum/maximum. However, the same data can be replayed and aggregated differently if a detailed analysis around a certain test event is required. This approach minimizes the investment and operational cost for IT and storage infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks.
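The aggregate-then-replay idea can be illustrated with a minimal sketch. It assumes the raw log is a plain list of `(timestamp_ms, value)` samples; the function name `aggregate`, the window sizes, and the sample data are hypothetical and do not represent the Gantner, Kafka, or CrateDB APIs.

```python
import statistics
from collections import defaultdict

def aggregate(samples, window_ms):
    """Group (t_ms, value) samples into fixed windows and compute KPIs per window."""
    buckets = defaultdict(list)
    for t_ms, value in samples:
        # Start of the window this sample falls into, e.g. 2700 ms -> 2000 ms
        buckets[(t_ms // window_ms) * window_ms].append(value)
    return {
        start: {
            "mean": statistics.fmean(values),
            "std": statistics.pstdev(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    }

# Hypothetical raw log: one channel at 10 samples/second for 4 seconds
raw_log = [(i * 100, float(i % 10)) for i in range(40)]

coarse = aggregate(raw_log, window_ms=2000)  # routine KPI dataset
fine = aggregate(raw_log, window_ms=500)     # replay of the same log at finer resolution
```

Because the raw log is kept unchanged, the finer aggregation can be produced later without re-running the test; only the window size differs between the two calls.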

Aircraft engine testing is a typical use case where a scalable data backend offers major advantages. Engine testing generates a lot of data, especially when engine transient responses must be recorded. Data rates can vary from 10 samples/second up to 100,000 samples/second. The challenge is to store massive amounts of sensor data, keep it available on a 24/7 basis, and allow rapid data analysis. Another example where a scalable data backend proves its advantages is fatigue testing of large components or full-scale structures. A typical fatigue test program is divided into a number of flight blocks. At the end of each flight block, the test is stopped and the test specimen is inspected for cracks. These manual inspections are time-consuming, and the interval between them is relatively long. Structural abnormalities may therefore be detected too late, which may result in in-service aircraft having to be retrofitted.

Condition-based inspection of the test specimen, instead of interval-based inspection, is a potential way to reduce the total fatigue test duration and to detect abnormalities quickly. One implication is that more sensors are required to monitor the behavior of the test specimen and to detect or predict structural failures. As a full-scale fatigue test can generate data at rates of up to 10 MB/s, totalling hundreds of terabytes at completion, data processing and analysis have become a major bottleneck.
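A quick back-of-the-envelope calculation puts these figures in proportion. The sketch below assumes the peak rate of 10 MB/s is sustained around the clock and uses decimal units (1 TB = 1,000,000 MB); both are simplifying assumptions, not statements from the test programs described here.

```python
RATE_MB_PER_S = 10               # peak rate quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

# Assumption: decimal units, 1 TB = 1,000,000 MB
tb_per_day = RATE_MB_PER_S * SECONDS_PER_DAY / 1_000_000

def days_to_reach(total_tb):
    """Days of continuous acquisition at the peak rate needed to reach total_tb."""
    return total_tb / tb_per_day

print(f"{tb_per_day:.3f} TB per day")                  # 0.864 TB per day
print(f"{days_to_reach(100):.0f} days to reach 100 TB")  # 116 days to reach 100 TB
```

At just under one terabyte per day, hundreds of terabytes correspond to many months of acquisition, which is consistent with the long duration of full-scale fatigue test programs.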

In order to capture, analyse, and store this enormous volume of data, and to ensure it is available to applications, Gantner turned to a combination of Apache Kafka (data streaming) and CrateDB (a distributed NoSQL database built for IoT/industrial use cases). CrateDB is used for real-time hot storage and Kafka for cost-effective, document-based storage.

“After extensive research and comparisons, we decided to use the combination of Apache Kafka and CrateDB for the design of the data backend,” explains Jürgen Sutterlüti, head of cloud and data analytics at Gantner Instruments. “They are virtually the engine around which the entire concept is built.”

New stream processing platform

Apache Kafka is a messaging system that queues the data received from instruments and makes it highly available to follow-on systems. Kafka looks at every single received value and analyses it for current measures, which is known as “stream processing”. However, capturing and storing these enormous volumes of data and making them accessible to follow-on systems requires a database that offers the appropriate performance, as well as interfaces that allow fast and convenient access.

CrateDB is a new kind of distributed SQL database that improves the handling of time series analysis. The use of SQL as a query language simplifies application and integration, and the NoSQL base technology allows you to process IoT data in a variety of formats. CrateDB can hold hundreds of terabytes of data and, thanks to the shared-nothing architecture within server clusters, guarantees real-time availability without data loss or downtime.
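To illustrate why SQL as a query language simplifies access to time series data, here is a minimal sketch of a windowed aggregation query. It runs against Python's built-in `sqlite3` module purely as a stand-in so that the example is self-contained; it does not use CrateDB, and the table name `measurements`, its columns, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in only, not CrateDB
conn.execute("CREATE TABLE measurements (ts_ms INTEGER, channel TEXT, value REAL)")

# Hypothetical data: two channels sampled every 100 ms for 2 seconds
rows = [
    (i * 100, channel, float(i) * scale)
    for i in range(20)
    for channel, scale in (("strain_01", 1.0), ("temp_01", 0.5))
]
conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", rows)

# One-second windows per channel; integer division truncates to the window start
query = """
    SELECT channel,
           (ts_ms / 1000) * 1000 AS window_start_ms,
           AVG(value) AS mean_value,
           MIN(value) AS min_value,
           MAX(value) AS max_value
    FROM measurements
    GROUP BY channel, window_start_ms
    ORDER BY channel, window_start_ms
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The point of the sketch is the shape of the query: a test engineer who knows standard SQL can express per-channel, per-window statistics in a few lines, without a product-specific query language.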

“CrateDB is extremely fast and highly scalable,” says Sutterlüti. “That’s why we use the database to store and access all aggregated measurement data. Working with CrateDB demonstrates the potential that can come from combining edge computing, big data handling, and machine learning.”

Connectivity

More and more test labs use specialised control, monitoring, and data acquisition systems. Examples from the field have shown that the lack of integration between these systems still leads to late detection of structural or system anomalies. One of the reasons is that multi-source data and/or metadata is not readily available during the test. Measurement data therefore cannot be fully analyzed until the end of the test run (or at predefined intervals).

The Kafka stream processing engine comes with an extensive set of APIs for integrating third-party data streams, for example from a control system. The primary measurement data can be enriched with control and simulation data. An open software architecture that supports a variety of publish-subscribe-based protocols (like MQTT and DDS) allows seamless integration with other monitoring, analysis, and reporting tools. For test labs that currently maintain automated test systems, an application programming interface (API) provides a simple way to integrate existing environments. In turn, they’re able to leverage the automatic recording, data storage, plotting, and configuration management capabilities. Users may also programmatically access data that’s stored in the data backend for use with custom graphics, analytics, and report generation applications.
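Enriching measurement data with control data essentially means joining two streams by time. The following minimal sketch attaches the most recent control event to each measurement sample. It works on in-memory lists rather than on Kafka topics, and the function `enrich`, the field names, and the sample events are hypothetical.

```python
from bisect import bisect_right

def enrich(measurements, control_events):
    """Attach the most recent control setpoint to each measurement sample.

    measurements:   list of (t_ms, value)
    control_events: list of (t_ms, setpoint), sorted by time
    """
    control_times = [t for t, _ in control_events]
    enriched = []
    for t_ms, value in measurements:
        idx = bisect_right(control_times, t_ms) - 1  # last event at or before t_ms
        setpoint = control_events[idx][1] if idx >= 0 else None
        enriched.append({"t_ms": t_ms, "value": value, "setpoint": setpoint})
    return enriched

# Hypothetical streams from a data acquisition system and a control system
measurements = [(0, 1.2), (100, 1.4), (250, 2.9), (400, 3.1)]
control_events = [(50, "load_step_1"), (200, "load_step_2")]

enriched = enrich(measurements, control_events)
for sample in enriched:
    print(sample)
```

Samples recorded before the first control event carry no setpoint (`None`); every later sample is tagged with the load step that was active when it was measured, so it can be analyzed in context while the test is still running.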

Gantner’s philosophy is to provide customers with open interfaces where they can store data and send that data wherever they want. Sutterlüti says, “Our customers value this option quite a bit because it relieves them from handling data and how to store it. Instead, they have easy access to APIs, and all the existing functions they are used to are still available.”

Rapid analysis

Analysis or failure detection differs by application. Common software and user interfaces for managing, visualizing, reporting, or defining dedicated event rules simplify access and integration. This approach minimizes the investment cost for IT and storage infrastructure in the test lab, whilst maintaining the necessary computing performance for test-critical data analysis tasks. For example, to perform temporal and spatial analysis of aircraft engines, or to better understand the mechanical response of an aircraft structural component, powerful querying capabilities enable engineers to analyze large amounts of sensor data on the fly.

Trend monitoring over the entire life of the test will quickly signal any significant change in test article response between repetitive test conditions. With an adaptive and scalable data backend solution, Gantner Instruments is able to provide a platform that allows test labs to grasp, monitor, analyze, and react to any physical data in real time, regardless of the data volume, transforming data into insight.
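One simple way to express such a trend check is to compare the mean response of each repetition with a baseline taken from the first repetitions. The sketch below is only an illustration under that assumption; the function `flag_changes`, the 5 % tolerance, the three-block baseline, and the strain values are all hypothetical and not taken from an actual test program.

```python
import statistics

def flag_changes(block_responses, baseline_blocks=3, tolerance=0.05):
    """Return indices of blocks whose mean response deviates from the baseline.

    block_responses: one list of response values per repetition of the
                     same test condition, in chronological order
    """
    means = [statistics.fmean(block) for block in block_responses]
    baseline = statistics.fmean(means[:baseline_blocks])
    return [
        i for i, mean in enumerate(means)
        if abs(mean - baseline) > tolerance * abs(baseline)
    ]

# Hypothetical strain response (microstrain) under the same load case
blocks = [
    [100.0, 101.0, 99.0],
    [100.5, 99.5, 100.0],
    [99.0, 100.0, 101.0],
    [100.0, 102.0, 101.0],
    [108.0, 109.0, 110.0],   # response has shifted by about 9 %
]

flagged = flag_changes(blocks)
print(flagged)  # [4]
```

In practice the rule would depend on the application, for example a tolerance per sensor or a statistical test instead of a fixed percentage, but the principle of comparing repeated conditions against an early baseline stays the same.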

Experience the new platform for modern and robust measurement setups

The GI.bench software platform combines faster test setups, project configuration and handling, and visualization of data streams in one digital workbench. It enables you to configure, execute, and analyze your measurement and test tasks on the fly. Access live and historical measurement data anywhere.
