© 2019 by NorCom Information Technology GmbH & Co. KGaA


Big Data Analysis of driving behavior

For car manufacturers, data-driven evaluation of driving behavior holds enormous potential for increasing customer satisfaction and cutting costs. To draw reliable conclusions, statistics must be derived from extremely large amounts of data.


In this project, DaSense therefore provided a Big Data Analytics environment for evaluating mass data in the cloud. Data newly stored in the environment via a data load path is automatically converted into a big data analysis format. Particular attention was paid to cleansing the data and checking its plausibility.


In addition, the analyses created had to be easy to transfer into regularly executable, programmatically controllable workflows. These requirements were addressed by providing an individually configurable, container-based development environment.

Document Management / Semantic Search

Faster and easier, yet secure, document exchange and collaboration across teams and countries are two key success factors for innovation and growth. In addition to millisecond search, EAGLE offers extensive options for structuring data as well as analysis functions that support work with the managed documents.

In addition to a search index, a big data environment is used in which advanced analytics and machine learning regularly derive new insights from the documents and add them to the index. Thousands of users access the hundreds of millions of documents managed by the system every day and get results for free-text searches in seconds.

The new transparency of the document content promotes collaboration and reduces costs, both through shorter search times for staff and, for example, through the detection of duplicated work.

Deep Learning with DaSense

To enable autonomous driving, the car must perceive its surroundings without errors and make appropriate decisions based on that perception. Technically, the car's perception results from the real-time fusion of data from a variety of on-board sensors. The car makes its decisions with the help of algorithms based on neural networks. These neural networks must be developed and trained to avoid mistakes.


The IT infrastructure for training neural networks must, on the one hand, handle a high volume of data (terabyte range) and, on the other hand, allow distributed development teams to train neural networks with reproducible workflows.


DaSense provides a deep learning environment that brings data and analysis together and allows automated training of neural networks. Neural networks are developed, trained and verified on mass data much faster. Business interruptions are reduced to a minimum.

RDE – Real Driving Emissions Test

Real Driving Emissions (RDE) describes the real emission behavior of cars, trucks and buses in everyday use. In the tests, exhaust emissions are checked in various situations under defined environmental conditions (temperature, speed, terrain, etc.).


The difficulty in evaluating the data lies in the sheer size of the data records, which additionally have to be correlated with one another. With DaSense, we parallelized the analyses so that the customer can scale. In addition, we developed a dedicated app that simplifies re-running the analysis with changing parameters.
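Re-running one analysis over many parameter combinations in parallel can be illustrated with a minimal sketch. It is not the DaSense app itself: the trip records, the function names `mean_nox` and `run_sweep`, and the parameter names are hypothetical, and the values are made up for illustration. A thread pool from the Python standard library stands in for a cluster engine.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Made-up illustrative trip records:
# (speed in km/h, ambient temperature in °C, NOx in mg/km).
TRIPS = [
    (35.0, 5.0, 62.0),
    (80.0, 12.0, 48.0),
    (120.0, 25.0, 91.0),
    (50.0, -2.0, 77.0),
    (95.0, 18.0, 55.0),
]


def mean_nox(params):
    """Average NOx over the trips inside one speed/temperature window."""
    max_speed, min_temp = params
    selected = [nox for speed, temp, nox in TRIPS
                if speed <= max_speed and temp >= min_temp]
    return params, (sum(selected) / len(selected) if selected else None)


def run_sweep(max_speeds, min_temps, workers=4):
    """Evaluate every parameter combination in parallel and collect the results."""
    grid = list(product(max_speeds, min_temps))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task returns (params, value), so the results form a dict directly.
        return dict(pool.map(mean_nox, grid))


results = run_sweep(max_speeds=[60.0, 130.0], min_temps=[0.0, 10.0])
for params, value in sorted(results.items()):
    print(params, value)
```

Because each parameter combination is evaluated independently, the same structure carries over to a distributed setting where every combination becomes its own job.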

Infrastructure for driver assistance systems

Driver assistance systems support the driver of a car electronically in certain driving situations. The goal is usually to increase safety. Developing such systems requires a new architecture for storing, processing and analyzing large amounts of data. Data must be made available for processing quickly, even if it is generated worldwide or stored locally. Likewise, large amounts of data (in the petabyte range) must be processed quickly by different teams, regardless of location.


To meet these requirements, we developed automated, robust and scalable DaSense-based workflows for data conversion and automated anomaly assessment, and integrated them with advanced analytics and machine learning for self-service processing. Large measurement series generated worldwide are now immediately accessible for processing. Anomalies in processes and data are detected early, and operational failures are avoided. Flexible and customizable analysis solutions ensure the required high speed of innovation.

Operating data for product improvement 

Evaluating operating data is important for companies that want to develop their products further and improve them in a targeted way. The goals are to optimize maintenance intervals and to feed knowledge back into product improvement. The globally distributed devices first transmit their data to the cloud, where initial evaluations are carried out. From there, the data is transferred to an on-premises DaSense environment for aggregation with inventory data.

Data is analyzed interactively on pre-aggregated tables, and the aggregations are periodically updated by big data workflows. A particular challenge is the handling of personal data, which requires a flexible set of rules for storage, usage and deletion.


DaSense Advanced Analytics workflows with interactive visualization provide new data-driven insights into device usage. Virtualization and templating enable fast, efficient adaptation to growing amounts of data.

Management of globally distributed data 

In vehicle endurance testing, more and more data is recorded in less and less time. As the vehicles are used worldwide, it is becoming increasingly difficult to transfer the resulting measurement data to the head office for analysis.

DaSense uses the Distributed Query Engine component to apply Hadoop's guiding principle, "Bring the algorithm to the data," to a global network of data centers. Scalable data load paths ensure that newly loaded data is automatically quality checked, preprocessed and converted into a Big Data analysis format. Within predefined periods of time, the first evaluations are available at the decentralized measuring stations. By networking DaSense instances, the reports can be aggregated across the stations for global analysis.
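The idea of evaluating data where it lives and combining only the results can be sketched as follows. This is an illustration under assumptions, not the Distributed Query Engine's API: `PartialStats`, `summarize_locally`, the station names and the values are all hypothetical. Each station reduces its raw measurements to a small summary, and only these summaries are merged centrally.

```python
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Compact summary a station sends instead of its raw measurements."""
    count: int = 0
    total: float = 0.0
    maximum: float = float("-inf")

    def merge(self, other):
        """Combine two summaries; equals one summary over all raw values."""
        return PartialStats(
            count=self.count + other.count,
            total=self.total + other.total,
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count if self.count else float("nan")


def summarize_locally(values):
    """Runs at the station, next to the data (hypothetical helper)."""
    stats = PartialStats()
    for v in values:
        stats = stats.merge(PartialStats(1, v, v))
    return stats


# Made-up illustrative values: coolant temperatures in °C at three stations.
stations = {
    "station_a": [88.0, 91.5, 90.0],
    "station_b": [95.0, 93.0],
    "station_c": [87.5],
}

# Only the small summaries travel over the network, never the raw series.
partials = [summarize_locally(values) for values in stations.values()]
global_stats = PartialStats()
for partial in partials:
    global_stats = global_stats.merge(partial)

print(global_stats.count, global_stats.mean, global_stats.maximum)
```

Count, sum and maximum can be merged exactly. Statistics such as medians cannot be combined this way and would need a different summary structure.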

In the background, a data mover designed for this purpose continuously transfers the data to the central data store for backup, in accordance with data governance rules.


Because data is available for evaluation immediately, test runs can be planned with more agility and development costs are reduced. Intelligent data management ensures that local data analysis is optimized and existing resources are used more efficiently.

Data analysis in the cloud with DaSense

In this project, DaSense handles the data analysis of simulated autonomous driving runs. The aim is to check the quality of the vehicle's autonomous steering. The steering reversal rate was defined as the measured variable; it shows how frequently and how strongly the vehicle has to correct a steering maneuver it has initiated. After the evaluation, this metric is correlated with other driving parameters to perform a root cause analysis. The customer was first provided with a DaSense environment for big data analytics on an MS Azure subscription. Data is ingested via a data load path for the customer's proprietary data formats, which regularly searches for new files and converts them to a big data format (Parquet).
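The exact definition of the steering reversal rate used in the project is not given here. The following minimal sketch assumes one common simplified formulation: a reversal is counted each time the steering angle changes direction by at least a threshold, and the count is normalized to reversals per minute. The function names, the threshold `min_gap_deg` and the sample values are hypothetical.

```python
def count_steering_reversals(angles_deg, min_gap_deg=2.0):
    """Count direction changes whose swing is at least min_gap_deg."""
    if len(angles_deg) < 2:
        return 0
    reversals = 0
    direction = 0              # +1 or -1 once a direction is established
    extremum = angles_deg[0]   # most extreme angle in the current direction
    for angle in angles_deg[1:]:
        if direction == 0:
            # The first significant movement sets the direction; no reversal yet.
            if abs(angle - extremum) >= min_gap_deg:
                direction = 1 if angle > extremum else -1
                extremum = angle
        elif (angle - extremum) * direction > 0:
            extremum = angle   # still moving the same way: extend the extremum
        elif abs(angle - extremum) >= min_gap_deg:
            reversals += 1     # swung back far enough: count a reversal
            direction = -direction
            extremum = angle
    return reversals


def steering_reversal_rate(angles_deg, sample_rate_hz, min_gap_deg=2.0):
    """Reversals per minute for a uniformly sampled steering angle signal."""
    if len(angles_deg) < 2:
        return 0.0
    duration_min = (len(angles_deg) - 1) / sample_rate_hz / 60.0
    return count_steering_reversals(angles_deg, min_gap_deg) / duration_min


# Made-up steering angles in degrees, sampled at 1 Hz.
signal = [0.0, 3.0, 5.0, 1.0, -4.0, -3.0, 2.0, 2.5, 0.0]
print(count_steering_reversals(signal), steering_reversal_rate(signal, 1.0))
```

The threshold acts as a hysteresis, so sensor noise below it is not counted as a correction. Its value strongly influences the metric and would have to be chosen to match the project's definition.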


As a kick-off, the customer received a four-day on-site training course that demonstrated how to create an analytics notebook and turn it into a DaSense app. The customer is now able to perform data analysis with DaSense in self-service. We continue to assist the customer in developing additional analytics to answer technical questions and in creating DaSense apps for recurring queries.

Ground surface detection with deep learning on image and time series data

The customer is developing an autonomously driving device that is to be equipped with artificial intelligence for optimal navigation. As a first step, deep learning algorithms are used to identify different types of ground surface (for example, concrete, grass, gravel and soil).

The prototype has ten different sensors that collect image and time series data, and the quality of their data is to be compared. The size and resource consumption of the models must be adapted to the prototype.

For this purpose, we provided an on-premises environment for loading, extracting and analyzing Big Data. In this environment, we used transfer learning on TensorFlow to train a selection of state-of-the-art deep learning models to detect multiple ground surface classes from image and time series data.


Afterwards, the best models were optimized and achieved an accuracy of over 95% with a model size of less than 10 MB. They are now ready for use on the prototype and for the further development of the navigation. The close cooperation enables the customer to train further models in self-service and to apply them on the prototype.

Predictive Maintenance

To avoid possible damage to vehicles, anomalies should be identified at an early stage and appropriate measures taken. For this purpose, the data stored in faulty vehicles is read out during repair and compared with other collected findings. If sufficient data is available, patterns can be identified that will help detect faults in vehicles in the future.

To make statistically reliable statements, the stored data must be evaluated across a very large number of individual vehicles and across data collected over several years. However, the sheer volume of the collected measurement data is beyond the reach of traditional analytics tools and requires advanced analytics techniques.

The evaluations were therefore transferred to a specially designed Big Data & Advanced Analytics environment. The data load paths that were set up allow existing domain-specific quality assurance software to be embedded in a scalable way and prepare the data for analysis. The existing analysis approaches were supplemented by pattern recognition that matches data against defined fault patterns and by an automated anomaly analysis. The transition to regular operation made it possible to extend the analyses and run them on new data in self-service, and to use them throughout the development department.
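One simple way to automate the search for conspicuous values across a fleet is a robust outlier score. The sketch below is an assumption for illustration, not the method used in the project: it flags vehicles whose reading deviates strongly from the fleet median, using the modified z-score based on the median absolute deviation (MAD). The function name, the vehicle IDs, the values and the threshold of 3.5 are hypothetical.

```python
from statistics import median


def flag_anomalies(readings, threshold=3.5):
    """Return vehicle IDs whose reading is an outlier by the modified z-score."""
    values = list(readings.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All typical values are identical; the score is undefined in this sketch.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return sorted(
        vehicle_id
        for vehicle_id, v in readings.items()
        if abs(0.6745 * (v - center) / mad) > threshold
    )


# Made-up illustrative readings of one stored value per vehicle.
fleet = {
    "v1": 12.1,
    "v2": 11.8,
    "v3": 12.4,
    "v4": 12.0,
    "v5": 19.7,
    "v6": 11.9,
}

print(flag_anomalies(fleet))
```

The median and MAD are used instead of mean and standard deviation because a few faulty vehicles would otherwise distort the very statistics that are meant to expose them.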

As a result, a measurement dataset collected over several years can be searched for patterns and anomalies within a few minutes instead of several days. The established statistical evaluations of the vehicle fleet support the early detection of faults and provide valuable information for data-driven vehicle development.