Most Reliable Method for Data Comparison, Series Rank

A practical guide to implementing a multi-level ranking


Photo by Carlos Muza on Unsplash

“Success rests not only on ability, but upon commitment, loyalty, and pride.” – Vince Lombardi

In this story, we discuss the power of the Rank method in solving complex business problems elegantly, along with a detailed explanation of our business case.

In the retail business, customer loyalty is a key driver of success. For that reason, the marketing team of our client runs novel loyalty programs throughout the year. Our analytics team provides the technical means to measure loyalty against various criteria.

As a part of one such program, every month the team selects a hundred top customers and awards them loyalty points. Afterward, the customers can convert these loyalty points to additional product discounts. Besides that, they will get more priority at the billing counters.

For this program, the criteria for determining loyalty include three factors:

  1. how frequently the customer has visited the store,
  2. the number of products the customer has bought, and
  3. the amount of money the customer has spent.

In Analytics Terms

The task is to analyze the sales data, identify the customers who do well on all of these factors together, and rank them accordingly. In effect, this avoids the influence of customers who do well on only one of these factors; such customers are not considered loyal.


Series Rank Method

All things considered, this is how our solution went. First, rank the customers on each factor independently, so that each customer gets three ranks, one per factor. Then merge these ranks to determine the final loyalty rank.

Rank as a comparing technique

Here are the three most important features of using a ranking method over other methods of comparison. Let’s examine them first before getting into the actual code.

1. Rank restricts the influence of outliers

The main reason for using the Series Rank method is that it records only the position and ignores the size of the difference. An outlier, however extreme, occupies only one position in the order, so its effect does not cascade through the entire list. Ranking checks only who is better, not by how much.
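To make the point concrete, here is a minimal sketch with made-up amounts (the values and customer labels are assumptions, not the client's data). It compares ranks with a simple min-max scaling, chosen here only as a contrasting example of a magnitude-based comparison.

```python
import pandas as pd

# Hypothetical monthly amounts; customer D is a deliberate outlier.
amount = pd.Series([120, 150, 135, 9000], index=["A", "B", "C", "D"])

# Rank 1 goes to the highest amount; only the order matters.
ranks = amount.rank(ascending=False)

# Min-max scaling keeps the magnitudes, so the outlier squeezes
# every other customer toward zero.
scaled = (amount - amount.min()) / (amount.max() - amount.min())

print(pd.DataFrame({"amount": amount, "rank": ranks, "scaled": scaled}))
```

With ranks, customers A, B, and C stay evenly spaced at positions 4, 2, and 3, while their scaled values become almost indistinguishable.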

2. Ranking eliminates the factor bias

Specific to this solution, we use multi-level ranking. A candidate may rank extremely well on one factor but poorly on the others. Combining all of a candidate’s ranks at a given level into a single unified score eliminates this factor bias. The score can then serve as the input to the next level of ranking. The illustration that follows shows this mechanism in detail.

3. Rank retains the row positions

Ordinary sorting changes the row positions, and if we sort on several columns one after another, only the order from the last sort is retained. Ranking, in contrast, leaves the rows where they are and records each position in a new column, which makes the result easy to analyze.
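The difference can be sketched on a small hypothetical dataframe (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Customer": ["C1", "C2", "C3"],
    "Visits": [3, 9, 5],
    "Amount": [50, 20, 80],
})

# Sorting returns the rows in a new order.
sorted_df = df.sort_values("Visits", ascending=False)

# Ranking adds one column per factor and leaves the row order alone,
# so several rankings can sit side by side.
df["Visits_Rank"] = df["Visits"].rank(ascending=False)
df["Amount_Rank"] = df["Amount"].rank(ascending=False)

print(sorted_df)
print(df)
```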


Let’s now solve the business problem that we discussed at the beginning.

Multi Level Ranking – An Illustration

1. Organize the data

We populated a dataframe from the month’s sales transactions. It looks as follows:

The Visits column denotes the total number of times a customer visited the store during the month. Likewise, the Products column denotes the total number of products purchased, and the Amount column shows the total amount spent, in dollars, during the month.
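One way such a dataframe might be built is by aggregating transaction rows per customer. The sketch below assumes a hypothetical transaction layout (the Invoice and Quantity columns and all values are invented for illustration); the actual source schema is not shown in this story.

```python
import pandas as pd

# Hypothetical transactions: one row per invoice.
transactions = pd.DataFrame({
    "Customer": ["C1", "C1", "C2", "C3", "C3", "C3"],
    "Invoice":  [101, 102, 103, 104, 105, 106],
    "Quantity": [2, 1, 10, 3, 4, 2],
    "Amount":   [40.0, 15.0, 300.0, 60.0, 85.0, 30.0],
})

# Assumption: each distinct invoice counts as one store visit.
df = (
    transactions.groupby("Customer")
    .agg(
        Visits=("Invoice", "nunique"),
        Products=("Quantity", "sum"),
        Amount=("Amount", "sum"),
    )
    .reset_index()
)

print(df)
```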

2. Rank at each Factor

By default, ranking assigns the better rank (that is, the numerically smaller one) to the lowest values. Our requirement is the opposite: the highest values should get the better rank. So we have specified the argument ascending=False.
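A minimal sketch of this step, using pandas Series.rank on a hypothetical four-customer dataframe (the values and the "_Rank" column names are assumptions):

```python
import pandas as pd

# Hypothetical monthly totals per customer.
df = pd.DataFrame({
    "Customer": ["C1", "C2", "C3", "C4"],
    "Visits":   [2, 1, 3, 3],
    "Products": [3, 10, 9, 5],
    "Amount":   [55.0, 300.0, 175.0, 90.0],
})

# Rank each factor independently; ascending=False gives rank 1
# to the highest value.
for col in ["Visits", "Products", "Amount"]:
    df[f"{col}_Rank"] = df[col].rank(ascending=False)

print(df)
```

Note that rank uses method="average" by default, so C3 and C4, who are tied on Visits, each receive rank 1.5.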

Dataframe after calculating ranks for all factors

3. Combine the Score

At this point, we have the ranks calculated for each factor individually. Now we need to calculate a score that combines all of these ranks. We do so by adding them together, so a lower score is better.
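Since the combined score is obtained by adding the ranks, a sketch of this step could look as follows. The rank values and the Score column name are hypothetical.

```python
import pandas as pd

# Hypothetical per-factor ranks (1 = best on that factor).
df = pd.DataFrame({
    "Customer":      ["C1", "C2", "C3", "C4"],
    "Visits_Rank":   [3.0, 4.0, 1.5, 1.5],
    "Products_Rank": [4.0, 1.0, 2.0, 3.0],
    "Amount_Rank":   [4.0, 1.0, 2.0, 3.0],
})

rank_cols = ["Visits_Rank", "Products_Rank", "Amount_Rank"]

# Row-wise sum of the ranks: a lower score means the customer
# did well across all factors.
df["Score"] = df[rank_cols].sum(axis=1)

print(df)
```

In this sample, C2 ranks first on two factors but last on visits, and ends up behind C3, who is consistently near the top on all three.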

Dataframe after combining the score

4. Create Final Ranking

To summarize, we have so far generated the ranks for the individual factors and added them to produce a combined score. We are now ready to generate the final ranks, that is, the loyalty ranks as per the criteria defined by the marketing team.
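Because the score is a sum of ranks, the lowest score is the best, so this second-level ranking can use the default ascending order. The sketch below is one possible version: the scores, the Loyalty_Rank column name, the choice of method="min" for ties, and the cut-off of two customers (standing in for the program's hundred) are all assumptions.

```python
import pandas as pd

# Hypothetical combined scores (lower is better).
df = pd.DataFrame({
    "Customer": ["C1", "C2", "C3", "C4"],
    "Score":    [11.0, 6.0, 5.5, 7.5],
})

# Second-level rank on the combined score; method="min" gives
# tied customers the same (best) whole-number rank.
df["Loyalty_Rank"] = df["Score"].rank(method="min").astype(int)

# Pick the top customers; 2 here stands in for the program's 100.
top_customers = df.nsmallest(2, "Loyalty_Rank")

print(df)
print(top_customers)
```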

Dataframe after calculating the final rank

With that, we have reached our final goal. The last column of this dataframe shows the final rank. It tells which customers should be awarded the points this month.

Conclusion

In essence, ranking is a very good technique for comparing data elements. It restricts the influence of outliers, eliminates the factor bias, and retains the row positions. The illustration shows how it can be applied at multiple levels.
