How to Join Arrays and Predict Problem Location, Numpy Outer

A step-by-step guide to using pandas in exploratory data analysis

If you want to combine arrays like a Cartesian product and detect problem patterns, Numpy Outer is a good route.

This method combines the arrays like an outer join and applies a function to each pair of elements. The resulting Cartesian product can be used as a baseline. When it is compared linearly with new observations, the differences reveal the problematic patterns in those observations.

Let’s see how we can use it to troubleshoot mechanical problems. I implemented this for a client who provides machine health monitoring services in the automotive industry.
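The pairing described above can be sketched in a few lines. This is a minimal illustration, assuming the method in question is the `outer` method that NumPy ufuncs expose (for example `np.multiply.outer`); the arrays `a` and `b` are made-up values, not data from the case study.

```python
import numpy as np

# Hypothetical inputs, chosen only to make the pairing easy to read.
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0])

# Every element of `a` is paired with every element of `b`,
# and the ufunc (multiply here) is applied to each pair.
products = np.multiply.outer(a, b)  # shape (3, 2)

# The same pairing works with any binary ufunc, for example add.
sums = np.add.outer(a, b)           # shape (3, 2)

print(products)
print(sums)
```

Row `i`, column `j` of the result holds the function applied to `a[i]` and `b[j]`, which is what makes the table usable as a lookup for any pair.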

Photo by Carlos Muza on Unsplash

Domain Notes

For those who are new to the vibration analysis field, here is a quick overview.

Skip this section, if you have already read any of the following articles:

An Easy Way to Detect Data Abnormalities, Numpy Accumulate

How to Detect More Data Patterns with Numpy Reduce

Better Data Filter for Custom Logic, Numpy At

Numpy ReduceAt, Simple Method to Isolate Problem with Data Slices

What is machine health monitoring?

In general, from chocolate to an aircraft, anything that is produced in a factory needs several machines to operate in synergy. However, if a particular machine in the critical path goes down it affects the complete chain of production. As a result, it incurs a huge loss to the company. So, it is very important to keep monitoring the health of each machine.

As a matter of fact, many kinds of machines like grinding machines, compressors, or refiners keep producing some vibrations while they are running.

Machine health monitoring aims at predicting the faults of a machine before they occur. All in all, this stream of science periodically checks the machine conditions, analyses the data and predicts the failures.

What is vibration analysis?

When humans feel sick, they express their pain by voice to alert others and to communicate with a doctor. But how can machines do that? Of course, they too follow a similar way.

When a problem is developing inside a machine, the machine expresses it by making different (abnormal) sounds and vibrations. A team of experts then analyzes these vibrations to detect the nature and cause of the fault.

Vibration analysis is a growing field of data analytics. It has great potential in driving the early detection of machine failures and saving enormous failure costs.

What do the field engineers do?

Our field engineers love the machines like pets. They regularly visit the factories, read the machine condition, and send that data to our analytics team.

Basically, they use devices such as a vibrometer or an accelerometer to read the machine vibrations. In effect, these devices translate the vibrations into mechanical attributes like the velocity, acceleration, and RPM of a machine.

What does the data analytics team do?

Well, the data engineers at the data analytics team enrich the attributes of the datasets. As soon as they receive the datasets from the field engineer, the data engineers transform the machine readings into more statistical measurements. And the enriched data can be easily interpreted by business analysts.

What do the business analysts do?

Here the business analysts are typically mechanical engineers. They understand the machine internals well. From the statistics reported by the data analytics team, they determine the problem location. Subsequently, they prescribe the corrective action.

Business Scenario

As part of our machine health monitoring service, our team audits several product line machines every month. These product line machines are operated using gearboxes. Gearboxes are among the most critical components of such machinery and are also susceptible to frequent failures, mainly due to rough handling by new joiners.

Where does the current analysis stand?

A gearbox typically contains two wheels: one major wheel and one minor wheel. If any tooth inside the gearbox is damaged, our regular vibration analysis detects that there is a problem. However, it doesn’t identify which wheel has the damaged tooth.

What needs to be done by the data analysts?

The data analysts need to analyse the machine data one level deeper and detect which wheel has the damaged tooth. This helps in organizing the parts replacement faster, even before the gearbox is opened.

What is the procedure to detect the damaged tooth?

Let’s say, for instance, that the major wheel has 12 teeth and the minor wheel has 6 teeth.

This is how the gearbox works, in simple terms:

When the gear is applied, both wheels come in contact with each other. A pair of teeth, one from the major wheel and one from the minor wheel, will be in contact first. As the wheels rotate, the next pair of teeth will come in contact.

Eventually, every tooth of the major wheel contacts a tooth of the minor wheel, once in 12 contacts. And every tooth of the minor wheel contacts the major wheel, once in every 6 contacts.
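The contact rhythm described above can be sketched with modular arithmetic. This is only an illustration under stated assumptions: teeth are numbered from 0, they engage in order, and contact number `k` involves major tooth `k mod 12` and minor tooth `k mod 6`. The names `MAJOR_TEETH`, `MINOR_TEETH` and `contacts` are made up for this sketch.

```python
import numpy as np

MAJOR_TEETH = 12  # tooth count of the major wheel (from the example above)
MINOR_TEETH = 6   # tooth count of the minor wheel (from the example above)

# Assumption: contact k engages major tooth k % 12 and minor tooth k % 6.
contacts = np.arange(24)
major_tooth = contacts % MAJOR_TEETH
minor_tooth = contacts % MINOR_TEETH

# Contacts at which tooth 0 of each wheel is engaged.
major_hits = np.flatnonzero(major_tooth == 0)
minor_hits = np.flatnonzero(minor_tooth == 0)

print(major_hits)  # [ 0 12]
print(minor_hits)  # [ 0  6 12 18]
```

Under these assumptions a given major tooth recurs every 12 contacts and a given minor tooth every 6, which is the property the rest of the analysis relies on.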

Will the vibrations be different when the gear is applied?

Obviously yes. The vibrations are low when the wheels rotate independently (gear not applied). But the vibrations are amplified when both wheels come in contact (gear applied), because one tooth pushes another tooth and that creates some friction.

Amplification factor: The factor by which each tooth amplifies the vibrations when it contacts a surface.

The amplification factors are available for each tooth and they are different from each other. They are derived from a healthy machine and baselined.

Compare the new vibrations with the baselined factors and detect the pattern of deviation. The pattern of deviation should indicate the wheel having the problem. If the deviation occurs once in every 12 contacts, then the problem is with the major wheel. If it occurs once in every 6 contacts, then it is with the minor wheel.

Let’s see how we can solve it using the Numpy Outer method.
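The decision rule above can be written as a small helper. This is a sketch, not the production logic: the function name `locate_faulty_wheel`, the tolerance `tol`, and the sample deviation arrays are all hypothetical, and the rule assumes a single damaged tooth whose deviations are evenly spaced.

```python
import numpy as np

def locate_faulty_wheel(deviation, major_teeth=12, minor_teeth=6, tol=1e-6):
    """Guess the faulty wheel from the spacing of non-negligible deviations."""
    hits = np.flatnonzero(np.abs(deviation) > tol)
    if hits.size < 2:
        return "undetermined"  # not enough deviations to measure a spacing
    gaps = np.diff(hits)
    if np.all(gaps == major_teeth):
        return "major"
    if np.all(gaps == minor_teeth):
        return "minor"
    return "undetermined"

# Hypothetical deviation series over 36 contacts.
every_12 = np.zeros(36)
every_12[[3, 15, 27]] = 0.05
every_6 = np.zeros(36)
every_6[[2, 8, 14, 20, 26, 32]] = 0.05

print(locate_faulty_wheel(every_12))  # major
print(locate_faulty_wheel(every_6))   # minor
```

Returning "undetermined" for irregular spacing keeps the helper from forcing a verdict when more than one tooth, or noise, is involved.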


  1. Start with the amplification factors for each tooth, specified for both the major and the minor wheels. These values are derived from time to time from a healthy machine.
  2. Calculate the net effective amplification factor when a pair of teeth come in contact. When the first tooth of the major wheel comes in contact with the first tooth of the minor wheel (1,1), the net effective amplification factor is
    1.08 X 1.01 = 1.0908. Likewise, for the pair (5,2) it is 1.06 X 1.02 = 1.0812.
  3. Let these be the baseline factors.
  4. Now run the machine and capture the vibrations. Let the following set be the observed amplifications in our first observation.
  5. Subtract the observed values from the baseline values.
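The steps above can be sketched end to end as follows. Several things here are assumptions rather than the original data: only the factors 1.08 and 1.06 (major teeth 1 and 5) and 1.01 and 1.02 (minor teeth 1 and 2) come from the text, and every other factor is a placeholder. The observations are simulated, the damaged tooth index is hypothetical, and contact `k` is assumed to pair major tooth `k mod 12` with minor tooth `k mod 6`. The method is assumed to be NumPy's `np.multiply.outer`.

```python
import numpy as np

# Step 1: amplification factors per tooth.
# Only major[0], major[4], minor[0], minor[1] are from the text; the rest are placeholders.
major = np.array([1.08, 1.03, 1.05, 1.07, 1.06, 1.04,
                  1.02, 1.09, 1.03, 1.05, 1.04, 1.06])
minor = np.array([1.01, 1.02, 1.03, 1.01, 1.04, 1.02])

# Steps 2-3: net factor for every (major tooth, minor tooth) pair, used as the baseline.
baseline = np.multiply.outer(major, minor)  # shape (12, 6)

# Step 4: simulate 36 contacts (assumed contact order, see lead-in).
contacts = np.arange(36)
expected = baseline[contacts % 12, contacts % 6]

observed = expected.copy()
faulty_major_tooth = 3  # hypothetical damaged tooth on the major wheel
observed[contacts % 12 == faulty_major_tooth] *= 1.10

# Step 5: subtract the observed values from the baseline values.
deviation = expected - observed
hits = np.flatnonzero(np.abs(deviation) > 1e-6)

print(round(baseline[0, 0], 4))  # pair (1,1): 1.0908
print(round(baseline[4, 1], 4))  # pair (5,2): 1.0812
print(hits)                      # [ 3 15 27]
print(np.diff(hits))             # [12 12]
```

In this simulated run the deviations are 12 contacts apart, which under the rule stated earlier points to the major wheel.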

Analysis: The resultant matrix above shows that the observed values deviate from the baseline values once in every 12 observations. All other values show zero or negligible deviation. This indicates that the problematic tooth is on the major wheel. Had it been on the minor wheel, the deviation would have occurred once in every 6 observations.


We read the amplification factors into two arrays. Using the Numpy Outer method, we calculated the corresponding value for each pair of teeth that come in contact and defined this as the baseline. By subtracting the new observations from the baseline, we discovered that the deviation occurs once in every 12 intervals. Thus we found that the problematic tooth causing the deviation is on the major wheel.

What Next?

If you would like to share your business case on how you’ve used this method, please send us the details. Our editorial team will get back to you about adding it to this publication. Please contribute to the integration of knowledge towards building a new generation of fast learners. An example speaks a thousand paragraphs.

Also Read

An Excellent Method for Instant Data Transformation, DF GroupBy Transform

How is GroupBy Rank used in Pricing Intelligence?
