If you want to reorganize time series data at a different frequency and aggregate some of its numeric attributes then resampler aggregate is a good choice.
Let’s see how this method transforms the data
Resampler reorganizes the time series data at a new time interval. As per our analysis needs, we can change the series of data to a bigger time interval (low frequency) or to a lower time interval (higher frequency). It reorganizes the series of data into different groups of successive time intervals. The aggregate method aggregates each column within an interval.
Are there any built-in functions to aggregate the columns?
Yes. Obviously, you don’t have to start from scratch. There is a good library of methods that can be readily used for transformation. Any method that a Series object supports like min, max, sum, mean, etc can be passed as an input parameter to the Aggregate method.
Can we write a custom aggregator?
Yes. You can write a custom function and pass that to the aggregator. Such a function should take a Series object as an input and return either a scalar value or another series object of the same size.
Does it apply on a single column or multiple columns?
Aggregator applies the aggregate function on all the column-indexed attributes. The timestamp attribute is row-indexed and used by the resampler.
Hope this clears some quick questions. Now with this level of clarity let’s look into a business case. The module that we have implemented for a retail client who runs in-store and e-commerce.
The price monitoring system of our web store regularly extracts the product data from various competitor websites. It extracts the raw data at a fixed frequency, once every 2 hours. The program visits the competitor websites one by one and extracts the required product details.
Our price manager needs to determine the price trend for each product from this data. Accordingly, our store has to adjust the prices.
If there is an increasing pattern, then we have an opportunity to increase our profit margin. If there is a declining trend in the prices, we need to be on alert and decrease the prices to prevent our regular customers from dropping out.
But how to determine the average price of a product? As we are fetching the details at different time stamps for different competitors?
The store manager specifies the measurement window as 3 hours. This means all the prices collected in an interval of 3 hours have to be averaged out, irrespective of the competitor. That provides the average price of a product during that interval. Likewise, we need to calculate the average price at each interval and plot the overall trend.
Let’s solve it using the resampler aggregate method.
- First, get the raw data extracted as per the baseline frequency. Product data of various competitors at the specified timestamp (TS).
- Set the time stamp column as the index column for the dataframe.
- Resample the dataframe as per the interval specified by the store manager. That is 3 hours. At this step the dataframe is reorganized into groups of ‘3 hour’ time intervals. Each group is an interval of 3 hours. All the records whose timestamp (TS) falls into this interval will be considered as part of that group.
- Call aggregate method with ‘mean’ as the parameter. Because, our requirement is to calculate mean of all rows, for each column, at each group level.
- The output of the above code will look like this:
Liquid Handwash shows a fluctuating trend. In the window 12:00 pm to 3:00 pm, the average price is $12.92. In the next interval, which is from 3 pm – 6 pm, the price has increased to $13.07. Subsequently, it has decreased in the next window, that is from 6 pm – 9 pm to $12.50.
Moisturizer & Sesame oil have shown a consistent declining trend in price.
Peanut oil’s price is consistently growing.
We have read the product data from various competitors as a time series. Resampled the data to match the measurement window of our analysis. Aggregated the price values of all products and calculated the mean at each interval. Thus we have enabled the store manager to detect the price trend in the market and adjust it to the optimum level. Subsequently contributing to the profitability of the client’s business.
If you would like to share your business case on how you’ve used this method, please send the details to firstname.lastname@example.org. Our editorial team shall get back to you in adding it to this publication. Please contribute to the integration of knowledge towards building a new generation of fast learners. An example speaks a thousand paragraphs.