Numpy ReduceAt, a Simple Method to Isolate Problems with Data Slices

A step by step guide to use pandas in exploratory data analysis


If you want to cut data into ad-hoc slices and aggregate the values in each slice for a comparative analysis, Numpy ReduceAt (the `reduceat` method of NumPy ufuncs such as `np.add`) is a good fit.

This method gives the analyst complete control over the slice boundaries in a minimal way: you pass the starting index of each slice, and it aggregates the values inside each slice.

Let’s see how we can use it in troubleshooting mechanical problems. We implemented this for a client who provides Machine Health Monitoring Services in the automotive industry.
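As a minimal sketch of the idea (the array and slice starts below are made-up values, not data from this article), `np.add.reduceat` takes the starting index of each slice and sums the values from that start up to the next one:

```python
import numpy as np

# Made-up values, purely for illustration.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Each entry is the index where a slice starts. A slice runs up to (but not
# including) the next start; the last slice runs to the end of the array.
slice_starts = [0, 3, 6]

# Sum and maximum of values[0:3], values[3:6] and values[6:].
slice_sums = np.add.reduceat(values, slice_starts)
slice_maxima = np.maximum.reduceat(values, slice_starts)

print(slice_sums.tolist())    # [6, 15, 15]
print(slice_maxima.tolist())  # [3, 6, 8]
```

The slice starts should be strictly increasing. Where a start is not smaller than the one after it, `reduceat` returns the single element at that start instead of an aggregate, which is easy to miss.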

Domain Notes

For those who are new to the vibration analysis field, here is a quick overview.

What is machine health monitoring?

In general, from chocolate to an aircraft, anything that is produced in a factory needs several machines to operate in synergy. However, if a particular machine in the critical path goes down it affects the complete chain of production. As a result, it incurs a huge loss to the company. So, it is very important to keep monitoring the health of each machine.

As a matter of fact, many kinds of machines like grinding machines, compressors, or refiners keep producing some vibrations while they are running.

Machine health monitoring aims at predicting the faults of a machine before they occur. It does so by periodically checking the machine's condition, analysing the data, and predicting failures.

What is vibration analysis?

When humans feel sick, they express their pain by voice to alert others and to communicate with a doctor. But how can machines do that? They do something similar.

When a problem is developing inside a machine, the machine expresses it by making different (abnormal) sounds and vibrations. A team of experts then analyzes these vibrations to detect the nature and cause of the fault.

Vibration analysis is a growing field of data analytics. It has great potential in driving the early detection of machine failures and saving enormous failure costs.

What do the field engineers do?

Our field engineers love the machines like pets. They regularly visit the factories, read the machine condition, and send that data to our analytics team.

They use devices such as a vibrometer or an accelerometer to read the machine vibrations. These devices translate the vibrations into mechanical attributes such as the velocity, acceleration, and RPM of a machine.

What does the data analytics team do?

The data engineers on the data analytics team enrich the attributes of the datasets. As soon as they receive the datasets from the field engineers, they transform the machine readings into statistical measurements. The enriched data can then be easily interpreted by business analysts.

What do the business analysts do?

Here the business analysts are typically mechanical engineers who understand the machine internals well. From the statistics reported by the data analytics team, they determine the problem location and then prescribe the corrective action.

Business Scenario

As part of our Machine Health Monitoring service to a chocolate factory, our field engineers audit various chocolate grinding machines every month.

During the audit, we do the fitness assessment of the machine for the next 6 months. We need to determine whether there is any deviation in the motor performance that could break down the machine in the near future.

How do you do the fitness assessment?

The fitness assessment is done by comparing a machine’s performance with a baseline. The performance of a healthy machine under the same conditions will act as the baseline. The factor of comparison is the motor speed measured in RPM (rotations per minute).

Chocolate grinding is an automated process in which various ingredients are added to the container at fixed timestamps. Adding an ingredient changes the load on the motor. In other words, every time a new ingredient is added, the motor enters a new phase of running.

In actual production there are many more readings and phases, but for the sake of this discussion they are simplified to 20 readings and 4 phases.

  • First phase : milk & sugar added at time index 4
  • Second phase : flour added at time index 9
  • Third phase : cocoa butter added at time index 14
  • Fourth phase : dry nuts and other solids added at time index 18

How do you say the machine is at fault?

The cumulative abnormality in the motor speed should not be more than 100 units in any of these phases. If it exceeds this threshold in any phase, we classify the machine as having a problem developing inside, which means the motor may break down in the near future.

What should the data analytics team do?

Provide a mechanism where the business analysts can easily change these phases and threshold values and instantly assess the fitness of several machines across several datasets.

Let’s see how we can solve it using the Numpy ReduceAt method.

Solution

  1. First, read the speed details of the baseline motor and the motor that we are auditing. Here the index denotes a common time interval at which the readings are taken.
  2. Calculate the abnormality at each reading.
  3. Define the phases as specified in the problem statement.
  4. Ask the ReduceAt function to cut the data into 4 slices as per these phases by passing the starting index of each slice (0, 5, 10 and 15): first slice from index 0 to index 4 (with milk & sugar), second slice from index 5 to index 9 (with flour), third slice from index 10 to index 14 (with cocoa butter), and fourth slice from index 15 till the end (with dry nuts and solids).
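The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the original dataset: the RPM values are invented so that the third phase sums to 115 as described in this article, and the abnormality is assumed to be the absolute difference between the audited and baseline speeds.

```python
import numpy as np

# Step 1: speed readings (RPM) at 20 common time indices.
# Invented values; the real readings are not part of this sketch.
baseline_rpm = np.repeat([1500, 1400, 1300, 1200], 5)
audited_rpm = np.array([
    1490, 1488, 1492, 1485, 1505,   # indices 0-4   (milk & sugar)
    1380, 1385, 1390, 1375, 1390,   # indices 5-9   (flour)
    1270, 1275, 1280, 1275, 1285,   # indices 10-14 (cocoa butter)
    1190, 1195, 1185, 1190, 1180,   # indices 15-19 (dry nuts and solids)
])

# Step 2: abnormality at each reading.
# Assumption: absolute deviation from the baseline speed.
abnormality = np.abs(audited_rpm - baseline_rpm)

# Step 3: phases, expressed as the starting index of each slice.
phase_starts = [0, 5, 10, 15]

# Step 4: cumulative abnormality inside each phase.
cumulative = np.add.reduceat(abnormality, phase_starts)

threshold = 100
# 1-based numbers of the phases whose cumulative abnormality exceeds the threshold.
faulty_phases = np.flatnonzero(cumulative > threshold) + 1

print(cumulative.tolist())     # [50, 80, 115, 60]
print(faulty_phases.tolist())  # [3]
```

Changing the phases or the threshold only means editing `phase_starts` or `threshold`; the rest of the calculation stays the same.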

ReduceAt calculates the cumulative abnormality of every phase in a single call and reports it. At the third phase, the value (115) is greater than the permissible threshold (100).

Analysis: As a developing problem is detected at the third phase, the motor should receive immediate maintenance to avoid a breakdown.
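To let the analysts vary the phases and threshold across several machines, the same calculation can be wrapped in a small function. This is a sketch only: `assess_fitness`, the machine names, and the readings are hypothetical, and the abnormality is again assumed to be the absolute difference from the baseline.

```python
import numpy as np

def assess_fitness(baseline_rpm, audited_rpm, phase_starts, threshold):
    """Return (cumulative abnormality per phase, 1-based faulty phase numbers)."""
    baseline = np.asarray(baseline_rpm, dtype=float)
    audited = np.asarray(audited_rpm, dtype=float)
    if baseline.shape != audited.shape:
        raise ValueError("baseline and audited readings must have the same length")
    # Assumption: abnormality is the absolute deviation from the baseline.
    abnormality = np.abs(audited - baseline)
    cumulative = np.add.reduceat(abnormality, phase_starts)
    faulty_phases = np.flatnonzero(cumulative > threshold) + 1
    return cumulative, faulty_phases

# Hypothetical machines audited against the same baseline (invented readings).
baseline = np.full(20, 1000.0)
machines = {
    "grinder_A": baseline + 5,    # deviates by 5 RPM at every reading
    "grinder_B": baseline - 25,   # deviates by 25 RPM at every reading
}

report = {}
for name, readings in machines.items():
    cumulative, faulty = assess_fitness(
        baseline, readings, phase_starts=[0, 5, 10, 15], threshold=100
    )
    report[name] = (cumulative.tolist(), faulty.tolist())
    print(name, cumulative.tolist(), "faulty phases:", faulty.tolist())
```

With these invented readings, `grinder_A` accumulates 25 units per phase and passes, while `grinder_B` accumulates 125 units per phase and is flagged in all four phases.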

Summary

We read the RPM details of two motors (derived from vibration readings) and derived the abnormality at each time index. We then sliced the data as per the operational phases of the motor and calculated the cumulative abnormality in each phase using Numpy ReduceAt. With this method, the business analysts detected the growing fault in the machine. They can quickly change the thresholds and phases as required and apply the technique to several other machines.

What Next?

If you would like to share your business case on how you’ve used this method, please send the details to pub@additionalsheet.com. Our editorial team will get back to you about adding it to this publication. Please contribute to the integration of knowledge towards building a new generation of fast learners. An example speaks a thousand paragraphs.

