How to Cultivate Analytics at Work by Transforming Data Instantly, With no Lag

A step-by-step guide to using pandas in exploratory data analysis

Reorganizing the data promptly and effectively is the key to an efficient exploratory data analysis.

To build an intelligent layer of operational analytics for any business, you essentially need two things:

  1. good knowledge of the business context and
  2. a deep understanding of the methods that aptly support the data transformation needs.

This article covers both. First, it explains an important retail business context related to product pricing and placement strategies. It then demonstrates the merits of the group by technique in addressing the data transformation needs of that context.

In the process, you will see how the Transform and Aggregate methods of the pandas GroupBy object support various critical operational decisions.

What is a group by and why is it important?

Group By is a procedure for reorganizing the data in a way that is more meaningful to the insight we are exploring. It doesn’t change any of the values, but it changes the logical grouping of the rows and defines a group index for each group.

It is important because raw data is produced from a single point of view, with only one key and a set of values. Exploratory data analysis, however, requires multiple points of view on the same dataset. So we need a mechanism that temporarily translates the ordering of the raw data into the form required for the analysis at hand.
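As a minimal sketch of this idea, the snippet below groups a small, made-up sales table by store. The column names and values are assumptions chosen only for illustration; they are not the client's data.

```python
import pandas as pd

# Hypothetical raw data: one key (order_id) and a set of values.
sales = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "store": ["North", "South", "North", "South"],
    "amount": [250.0, 100.0, 150.0, 300.0],
})

# Grouping gives a second point of view on the same rows.
by_store = sales.groupby("store")

# Each group is identified by its group index (here, the store name).
group_keys = list(by_store.groups)

# The rows themselves are untouched; they are only regrouped logically.
north_rows = by_store.get_group("North")

# A per-group summary, indexed by the group index.
totals = by_store["amount"].sum()
```

The original `sales` frame is not modified by any of these calls, which is what makes the regrouping temporary.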

With that clarity, let’s look at two business scenarios from a module we implemented for a retail client who runs both in-store and e-commerce operations.

1. Data Transformation in Pricing a Product

Business Context: 

This problem is related to the cost distribution of the products. It needs to be quick and accurate in order to set the correct selling price for the individual products.

Whenever a supplier delivers merchandise to the store, the store manager pays the transportation cost to the supplier, divides that cost, and adds it as an indirect cost to every unit of the product delivered in that batch. The transportation cost differs from supplier to supplier.

It is very important to revise the cost of the product at every stage of the supply chain. Otherwise, the business could incur a heavy loss if some cost elements are missed while pricing the product.


Transform is the better choice when you have to derive a new column from one existing column. We grouped the data by supplier index and then transformed the purchase cost.

By using the Group By Transform method, we are able to instantly transform the purchase cost into an indirect cost and revise the unit cost of the product, which contributes to the accuracy of pricing.
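The following is a rough sketch of how such a transformation might look, not the client implementation. It assumes that each supplier's transportation cost is split in proportion to each line's share of that supplier's purchase cost; the column names, the `transport_cost` mapping, and all figures are hypothetical.

```python
import pandas as pd

# Hypothetical delivery lines; purchase_cost is the total for the line.
deliveries = pd.DataFrame({
    "supplier": ["S1", "S1", "S2", "S2"],
    "product": ["tea", "coffee", "rice", "flour"],
    "units": [10, 10, 20, 5],
    "purchase_cost": [100.0, 300.0, 200.0, 50.0],
})

# Hypothetical transportation cost paid to each supplier for the batch.
transport_cost = {"S1": 40.0, "S2": 50.0}

# transform returns one value per original row, so the group total
# lines up with the rows it was computed from.
supplier_total = deliveries.groupby("supplier")["purchase_cost"].transform("sum")

# Assumed allocation rule: share of the supplier's purchase cost.
share = deliveries["purchase_cost"] / supplier_total
deliveries["indirect_cost"] = share * deliveries["supplier"].map(transport_cost)

# Revised unit cost including the allocated transportation cost.
deliveries["unit_cost"] = (
    deliveries["purchase_cost"] + deliveries["indirect_cost"]
) / deliveries["units"]
```

Because `transform` keeps the shape of the input, the new columns can be assigned straight back to the original frame without a merge.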

A detailed explanation of the solution and a code walkthrough are available in the following article:

An Excellent Method for Instant Data Transformation, DF GroupBy Transform

2. Data Aggregation to Position a Product

Business Context:

Assortment strategies work hand in hand with the pricing strategies. They are very important to maximize sales because the customers directly interact with the product mix shown on the display and make purchase decisions based on what they see.

The client wants to ensure that the products in a display always fall into the same price range. The maximum price difference within a display should not exceed 25%.


We grouped the products by their display id, which acts as the group index. Then, using the Aggregate method, we calculated the price range in each display unit and classified the products on the basis of whether they fit into that range. Finally, we made recommendations on how to adjust the prices of the products so that they can stay in the same display unit.

A detailed explanation of the solution and a code walkthrough are available in the following article:
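A simplified sketch of this aggregation is shown below. It assumes the 25% limit is measured relative to the lowest price in each display, which is one possible reading of the rule; the data, column names, and the `MAX_SPREAD` constant are hypothetical.

```python
import pandas as pd

MAX_SPREAD = 0.25  # assumed: max allowed difference relative to the lowest price

# Hypothetical products and the display unit each one sits in.
products = pd.DataFrame({
    "display_id": ["D1", "D1", "D1", "D2", "D2"],
    "product": ["A", "B", "C", "D", "E"],
    "price": [100.0, 110.0, 140.0, 40.0, 48.0],
})

# Aggregate collapses each group to one row: the price range per display.
price_range = products.groupby("display_id")["price"].agg(["min", "max"])
price_range["spread"] = (price_range["max"] - price_range["min"]) / price_range["min"]
price_range["within_limit"] = price_range["spread"] <= MAX_SPREAD

# Classify each product against the ceiling implied by its display's lowest price.
ceiling = products["display_id"].map(price_range["min"]) * (1 + MAX_SPREAD)
products["fits_display"] = products["price"] <= ceiling
```

Unlike `transform`, `agg` returns one row per group, so the result is mapped back through `display_id` when a per-product flag is needed. Products flagged as not fitting are the candidates for a price adjustment or a move to another display.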

Practical Guide to GroupBy Aggregate in Pandas: Solving a Retail Problem


With the instant transformation of the indirect cost, we enabled the price strategist to prevent cost leaks. Likewise, by implementing custom logic in the data aggregation, we enabled the store manager to define the display units with ease and increase overall profitability.
