Everything You Need To Know About Series Rolling Method


Rolling-window arithmetic can make time-based decisions much better informed.

My roommate works as an assistant store manager. This week he was very busy, working overtime almost every day. When I asked why, he said he was having a tough time keeping up with his manager’s ever-changing needs.

Oh, that’s a common problem with many managers. I thought I should help if I possibly could, so I asked him to elaborate. He seemed to be looking for someone who would at least listen, because he started explaining immediately.

Here comes the problem

This festive season we are offering flash discounts on selected grocery items to our online customers. Flash discounts are active for a fixed time, and they are shown on the web screen only during their active period. We started with flash discounts of 30 minutes. We need to find out which 30-minute window in a day is best for displaying an offer.

My system receives the consolidated sales data every minute. I’ve written a script that consolidates the last 30 minutes of sales data every minute. From this, we conclude which items are selling best in which slot of 30 minutes in a day. Accordingly, we display the offer related to those items in that high traffic window. This is to maximize the number of people who can avail of that offer.

Well, interesting. What’s the pain point here? I asked.

So far so good. But in no time, my manager extended the offer from 30 minutes to 45 minutes. I modified the program to consolidate the last 45 minutes of data every minute. The very next day, he introduced a couple more offers on the same items: one with a 30-minute duration and the other with a 60-minute duration. I had to modify my script in several places to accommodate multiple consolidations at different time intervals. Now the offers are being extended to several other items.

The script is becoming very complicated. The window durations keep changing and more offers are added day by day, so I have to spend long hours making sure the script doesn’t break with each change.

After listening to my friend, I felt there was a better way to solve it: the pandas Series object. So we started working on it with some sample data.

Solution, Hand in Hand

Let’s see whether the ‘rolling’ method of the Series object is a good fit for your problem.

I asked him to load the whole day’s sales data into a Series, as it looks at the end of the day. [To keep the demo simple, I am showing only 2 hours of data here, one value per minute.]

import pandas as pd

series = pd.Series([25, 31, 40, 39, 15, 0,  6, 47, 39,  6, 49,  3,  1, 44,  3, 12, 25,
       23, 20, 30, 36, 24, 16, 36, 37, 38, 36, 17, 23,  0, 21, 19, 40, 11,
       28, 16, 45, 29, 47,  0, 44, 30, 21, 42,  5, 31,  2, 43, 29,  1,  8,
       27, 36, 16, 41, 0, 32, 10, 30, 34, 40, 13, 18, 21, 13, 10, 30,  2,
       46, 44, 48, 21, 13, 26, 34, 28, 34, 37, 11, 49, 19, 43, 11, 10, 49,
       21, 12, 49, 19, 32, 14, 28,  8, 41,  3, 47, 36,  3, 43, 39, 14, 46,
       17, 48, 43, 18, 21,  3, 14, 18, 46, 24,  3, 35, 24, 25,  9, 32, 43,
       46])

Roll it up over the desired window duration, say 30 minutes.

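# win_type='triang' applies triangular weights (this needs SciPy), so the result is a
# weighted sum that counts the middle of each window most; drop win_type for a plain sum.
rollingsum = series.rolling(30, win_type='triang').sum().round()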

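The resulting Series now holds, for every minute, the rolling sum (weighted by the triangular window) of the 30 minutes ending at that minute. The first 29 entries are NaN because the window is not yet full.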

[Figure: Rolling sum, 30-minute window. Best slot: minutes 61 to 91.]

The array and the plot show that the maximum value (423) occurs at minute 91. So the best window to display this particular 30-minute offer is from minute 61 to minute 91.
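Reading the peak off a plot works, but it can also be located in code. The following is only a minimal sketch on a small made-up series (the numbers and the names sales, window, start and end are assumptions for illustration), and it uses an unweighted window so that it runs without SciPy. The same idxmax call should work on a weighted result like the one above.

```python
import pandas as pd

# Hypothetical per-minute sales for ten minutes (made-up numbers for illustration).
sales = pd.Series([5, 1, 0, 8, 9, 7, 2, 0, 1, 3])

window = 3  # window length in minutes

# Unweighted rolling sum; the first (window - 1) entries are NaN because
# the window is not yet full.
rolling_sum = sales.rolling(window).sum()

end = rolling_sum.idxmax()   # index label where the best window ends (NaN is skipped)
start = end - window + 1     # first index covered by that window
best_total = rolling_sum.max()

print(f"Best {window}-minute window: index {start} to {end}, total {best_total:.0f}")
```

Here the largest total is 8 + 9 + 7 = 24, for the window covering index 3 to 5.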

On the Flexibility

Now, let’s get to the actual pain point. If your manager wants the analysis for a different window duration, all you need to do is change one line of code.

Change series.rolling(30, win_type='triang') to series.rolling(60, win_type='triang')

That’s it. You instantly have the analysis for a different window.

[Figure: Rolling sum, 60-minute window. Best slot: minutes 49 to 109.]
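If the sales data carries real timestamps, the window can also be written as a duration instead of a row count. This is a hedged sketch, not part of my friend’s script: the timestamps and values are made up, and it assumes the Series has a sorted DatetimeIndex, which pandas requires for offset windows.

```python
import pandas as pd

# Hypothetical: six per-minute sales counts stamped with made-up times.
idx = pd.date_range("2024-01-01 09:00", periods=6, freq="min")
sales = pd.Series([4, 0, 3, 5, 1, 2], index=idx)

# An offset window covers the 3 minutes ending at each timestamp.
# Unlike a fixed-size window, early rows are partial sums rather than NaN.
totals = sales.rolling("3min").sum()

print(totals)
print("Best window ends at:", totals.idxmax())
```

With this form, changing the duration means changing the string, for example "30min" or "60min".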

The Distinction

In the same way, you can plot any number of windows over one set of data, so your manager can make a better-informed decision. Moreover, you only have to run this once a day, not every minute. Doesn’t that save time and effort while offering more flexibility?
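To make the "any number of windows" idea concrete, the window length can become a parameter. The helper below, best_slots, is a hypothetical name of my own and only a sketch on made-up data. It uses an unweighted rolling sum so that it runs without SciPy.

```python
import pandas as pd

def best_slots(sales: pd.Series, windows: list[int]) -> pd.DataFrame:
    """Return the best-selling slot for each window length (in rows)."""
    rows = []
    for w in windows:
        totals = sales.rolling(w).sum()   # unweighted rolling sum
        end = totals.idxmax()             # index where the best window ends
        rows.append({"window": w, "start": end - w + 1,
                     "end": end, "total": totals.max()})
    return pd.DataFrame(rows).set_index("window")

# Hypothetical per-minute sales (made-up numbers for illustration).
sales = pd.Series([5, 1, 0, 8, 9, 7, 2, 0, 1, 3])

summary = best_slots(sales, [2, 3, 4])
print(summary)
```

Adding a new offer duration then means adding one number to the list, rather than editing the script in several places.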

Conclusion

Time series analysis involves a lot of trial and error, so requirements keep changing, especially when it comes to window boundaries. It is better not to write window-specific code. The pandas Series offers an easy and compact way to compute rolling sums on time series data. By using this method you can make your code more flexible and easier to maintain.
