Moving Averages in Spotfire

This week a user contacted me for assistance setting up a 3-month moving average calculation.  He’d already attempted it, but the result was wrong.  This is a common problem with the moving average function because of the way it’s built.  That’s not to say that it’s built wrong.  It’s just wasn’t built the way he wanted it to be built.

To explain, I will begin with an example of the Moving Average aggregation used on the y-axis of a visualization because it’s the easiest to understand.  Then, I’ll move on to a moving average calculation in a calculated column, which is a bit different.

Example of Moving Average Written on Y-Axis

The first screenshot below shows the configuration of the bar chart below it.  We are using the Moving Average aggregation and have chosen an Interval size of 3.  The actual expression makes use of the Last Periods node navigation method.  Note, Spotfire uses the term “Last Periods”.  A period is whatever you put on the x-axis, whether that be days, weeks, or months.  In our example, a period is a month.

In it’s simplest form, the expression sums up oil prod then averages it over the last three periods on the x-axis.   However, it’s a bit more complex because there is also an If statement after the average.  The If statement is counting periods on the x-axis.  A result is returned only when the count is 3.  If the result is not 3, null is returned.  That’s why the result is null until the visualization makes it past 3 periods of data.

 

When you are using a similar expression in a calculated column, it works a bit differently.

 

Example of Moving Average as Calculated Column

In this example, I am going to use a 3-day moving average rather than a 3-month moving average.  The premise is the same.    I calculated the 3-day moving average with this expression:

Avg([Gas Prod]) over (Intersect([Well Name], LastPeriods(3,[Prod Dt])))

That expression says — Average Gas Prod for each Well Name over the last three periods as defined by the Prod Dt.  In this case, Prod Dt is a day of the month.  Thus, the expression will average Gas prod for each Well Name for the last three days.  Here is the data:

As you can see, Spotfire is taking the first day of gas prod and dividing by one.  Then it adds day 1 and day 2 and divides by 2.  Thus, the first two days aren’t really a 3-day moving average.  This may work for you or it might not.  If you don’t want to see the average until 3-days have passed, simply add 2 more calculations.  One is a counter for the days.  The other is an if statement. This will return null until 3 days have passed, just as the previous example did.

Counter — Rank([Prod Dt],[Well Name])

3-day moving average — If([Count Days]<3,null,Avg([Gas Prod]) over (Intersect([Well Name],LastPeriods(3,[Prod Dt]))))

Hopefully, this clarifies how the function works and also how to use the Last Periods node navigation method.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

4 thoughts on “Moving Averages in Spotfire

  1. Ravi Tripathi Reply

    Hi,

    I need to calculate column where first need to add revenue for last 7 days. I have 21 other columns which need to work as filter so I applied a calculation with all these 21 column in intersect along with Lastperiods(7,[date]): To clarify more:

    Sum([revenue]) over Intersect(col1, col2, col3,…..col21, Lastperiods(7,[Date])).

    Now, it is resulted into missing several records. I analyze and then found first it filter for col1 and then it works for other column which resulted into wrong value.

    Let me know if you are getting some idea.

    Rgards,
    Ravi

    • Julie Sebby Post authorReply

      I think I understand what you are saying. Let’s say that col 1 = customer, col 2 = region, and col 3 = city. Then you have an expression like this — Sum([revenue]) over Intersect([customer], [region], [city]), Lastperiods(7, [Date]). That is going to give you the revenue for the last 7 days for each customer in each region in each city. Within the scope of that more detailed example what would you want it to do? Using over is like saying “for each”. It’s not a filter per se. It’s more of a grouping. Can you be a bit more specific about what you are trying to accomplish?

  2. Kaitlin Soehner Reply

    Hi! How would I do a % difference in production from today’s production vs. yesterday’s production. I have a similar data set with a lot of wells and lots of days of production as provided above. How would I, instead of averaging, just find the difference in today’s production and yesterday’s divided by yesterday’s. Or % Diff = (Today BOPD – Yesterday BOPD)/Yesterday BOPD

    • Julie Sebby Post authorReply

      Do you want to create a calculated column or do you want to write the expression on the y-axis of a visualization? The answer is different depending on which method you use. If the answer is y axis, what visualization type.

Leave a Reply

Your email address will not be published. Required fields are marked *