Type curves are an important part of resource assessment of an oil and gas asset. In this workflow, well declines are aggregated to determine typical behavior of a well ensemble. These well ensembles usually reflect a reservoir or set of analog reservoirs that will help determine characteristic behavior. In this post, we will build a decline model for the group of wells that is called a type curve. The type curve will capture the production rate forecast for a single “average” well and so can be used to determine Estimated Ultimate Recovery (EUR). Best yet, we’ll do it in 20 minutes. A little longer than GEICO, but you’ll save so much more money.

There are countless tools that support the type curve workflow. There are also many levels of sophistication around type curve prediction, handling different flow regimes, decline characteristics, and uncertainty. Since type curve generation is so central to resource assessment, though, it’s useful to be able to merge it with the ad hoc analytics capability of Spotfire. So useful, in fact, that it will open up a lot of different pathways for improving the resource assessment process, especially for unconventionals. But that’s for a different post.

At the end of the day, anyone who has done type curve fitting knows it’s as much of an art as it is a science. The statistical best fit may not actually model the dynamics of the reservoir and it takes an engineer’s intuition to model a more accurate forecast. The machine can get close, though, and that’s where the dynamic and interactive nature of Spotfire is really going to improve the type curve fitting workflow.

In this post, we will walk through the mechanics of calculating vanilla type curves in Spotfire. You can get the end result off of Exchange.ai. I call our objective ‘vanilla’ because this is the most basic, straightforward method. There’s a lot more you can add to the decline fit to improve its usefulness. Let’s consider those extra toppings… like we could make hot fudge type curves. Who says data science can’t be delicious?

Because we want this to run out of the box in Spotfire, we are going to use base R, meaning we won’t use any packages. There are several things we are doing which would be easier done with an R package, but we don’t want to complicate the user’s experience by having to install packages. Plus, there’s nothing better than implementing things from scratch to make sure you know what you’re doing.


We will assume that you have access to well header and production data. On the well header table, it’s nice to have location, reservoir, well type, and other metadata that will help you grab reasonable well groupings. The production table needs a valid lookup column to relate it back to the well header table, plus your fluid rates (oil, gas, water). The table itself can be daily or monthly. If you’re doing a play assessment, it’s likely you only have monthly data. The granularity of monthly data works well for the type curve workflow as it cleans up a lot of the noise inherent in daily rates. If you are using your own proprietary data, then you probably have daily. The method in this post works for either; the choice really just changes the units of the results (e.g. BBLs of decline per day or per month).

For this post, we will assume you have daily data.

Type Curve Equation

For this workflow, we will use the Arps equation. You’ll see in the code how you can make this more general (or put in your own “secret sauce” type curve equation). Ultimately, we will use R’s nonlinear equation fitting functionality to find the best fit for parameters. Since it’s nonlinear, you can put in pretty much any function to be fit. There are caveats, of course, as it may not be able to find the best fit parameters given the starting conditions and function characteristics.
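
If it helps to see the equation spelled out, here is a minimal sketch of the Arps hyperbolic decline as a base R function. The name arps_rate and the example parameter values are my own placeholders for illustration, not something from the Spotfire template. Here qi is the initial rate, a is the initial decline rate (in the same time units as t), and b is the hyperbolic exponent.

```r
# Arps hyperbolic decline: q(t) = qi / (1 + b * a * t)^(1/b)
# arps_rate is a hypothetical helper name, not part of any package.
arps_rate <- function(t, qi, a, b) {
  if (abs(b) < 1e-8) {
    # As b approaches 0 the hyperbolic form reduces to exponential decline
    return(qi * exp(-a * t))
  }
  qi / ((1 + b * a * t)^(1 / b))
}

# Made-up parameters: 1000 BBL/day initial rate, 1% per day initial decline, b = 1
days <- c(0, 365, 3650)
rates <- arps_rate(days, qi = 1000, a = 0.01, b = 1)
print(rates)
```

Because nls accepts ordinary function calls inside its formula, a function like this could in principle stand in for the inline expression, which is one way to swap in a different decline model.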

Calculating a Best Fit Type Curve

Let’s start by getting the data into Spotfire’s TERR environment. This is done by creating a data function and sending in the production data. I’d suggest limiting the production data you send in by marking or filtering, but of course that depends on the workflow you want to develop. In this post, I’ll assume you’re only sending in production you’ve limited by marking. If you have no idea what I’m talking about, just keep reading because this part isn’t necessary for calculating type curves.

For simplicity, we will just walk through putting type curves on the oil stream. Doing the same for gas and water is pretty much an identical procedure. We will assume that the production data table has a well name column, a date column, and an oil rate column. We have twenty minutes, remember!

My input for the calculation will be the production table (data), with WellName, ProdDate, and Oil columns.

The first step is to calculate DaysOn in order to normalize the type curves. We will do this by calculating the minimum production day. We can improve this by calculating the minimum production day which has nonzero oil rate, or even the date of peak oil. To do this we will use the aggregate function.

minDate = aggregate(x=data$ProdDate, by=list(data$WellName), FUN=min)
colnames(minDate) <- c("WellName","MinDate")

We will merge the result back with our production table and subtract the min date from the production date to get DaysOn. Technically, you may want to remove down days from the DaysOn calculation. There are some easy ways to do this if you want to clean up “DaysOn”.

data = merge(data, minDate, by="WellName", all.x=TRUE)
data$MinDate = as.POSIXct(data$MinDate, origin="1970-01-01", tz="UTC")
data$DaysOn = as.numeric(data$ProdDate - data$MinDate, units="days")
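
One of the improvements mentioned above is to start the clock at the first day with nonzero oil rather than the first reported day. Here is a rough sketch of that idea in base R. The toy table and its values are made up, and the column names (WellName, ProdDate, Oil) are the ones assumed in this post; treat it as one possible approach, not the template's implementation.

```r
# Toy production table (made-up values) with the columns assumed in this post
data <- data.frame(
  WellName = c("A", "A", "A", "B", "B", "B"),
  ProdDate = as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-03",
                          "2020-03-10", "2020-03-11", "2020-03-12"), tz = "UTC"),
  Oil = c(0, 120, 110, 90, 85, 0),
  stringsAsFactors = FALSE
)

# Keep only producing days, then take each well's earliest producing date.
# Dates are handled as plain numbers (seconds since 1970) and converted back.
producing <- data[!is.na(data$Oil) & data$Oil > 0, ]
first.sec <- tapply(as.numeric(producing$ProdDate), producing$WellName, min)
firstProd <- data.frame(
  WellName = names(first.sec),
  MinDate = as.POSIXct(as.numeric(first.sec), origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)

data <- merge(data, firstProd, by = "WellName", all.x = TRUE)
data$DaysOn <- as.numeric(difftime(data$ProdDate, data$MinDate, units = "days"))

# Days before first production come out negative, so drop them
data <- data[data$DaysOn >= 0, ]
print(data[, c("WellName", "DaysOn", "Oil")])
```

In this toy example, well A's leading zero-rate day falls away and its first producing day becomes day 0. Zero-rate days after first production are kept, so down days would still need separate handling.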

Having the well count will be useful for improving the curve fit and the user will need to see this curve back in Spotfire. A simple way to do this is to make a column of 1’s which we will sum.

data$WellCount = 1

Okay, let’s aggregate our oil across wells to get our dataset to run the type curves on.

type.data = aggregate( Oil ~ DaysOn, data=data, FUN=mean )

And let’s add in the well count.

type.data = merge( type.data, aggregate(WellCount ~ DaysOn, data=data, FUN=sum))

Now we are going to predict oil rate using time (basically decline). We will use the Arps function, which takes three parameters: initial oil rate (qi), initial decline rate (Di, written as a in the code), and the hyperbolic exponent (b). Note that a is the nominal decline rate per unit of DaysOn, not the effective decline (De) that is often quoted as a percent per year. These three parameters describe the initial production rate, its initial decline, and how decline changes as a function of time. In R, we can use the ‘nls’ function to find a best fit of the parameters.

fit = nls(Oil ~ qi/((1+b*a*DaysOn)^(1/b)), data=type.data, start=list(qi=max(type.data$Oil),a=0.01,b=1),control=nls.control(warnOnly=TRUE))

I’ve provided some reasonable start values for the parameters. Without good start values, nls may not consistently converge to a good fit. We could improve this by first fitting an exponential decline with a linear regression on log(Oil), then using those values as start parameters. I’ve also told the function to only warn me if it doesn’t converge. Basically, I still want the values back even if we didn’t get to convergence. The user can just run it again if they don’t like the fit.
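
Here is a minimal sketch of seeding nls from a log-linear fit. An exponential decline is a straight line in log space, so lm can estimate qi and a directly. The data below is synthetic (a made-up hyperbolic decline with a little noise), and starting b at 1 is just a guess on my part, not a recommendation from the template.

```r
# Synthetic type-curve data, made up for illustration:
# hyperbolic decline with qi = 800, a = 0.005, b = 0.8, plus 2% noise
set.seed(42)
type.data <- data.frame(DaysOn = seq(0, 720, by = 30))
true.oil <- 800 / ((1 + 0.8 * 0.005 * type.data$DaysOn)^(1 / 0.8))
type.data$Oil <- true.oil * (1 + rnorm(nrow(type.data), sd = 0.02))

# Step 1: exponential decline is linear in log space:
#   log(Oil) = log(qi) - a * DaysOn
pos <- type.data[type.data$Oil > 0, ]
lin <- lm(log(Oil) ~ DaysOn, data = pos)
start.vals <- list(qi = unname(exp(coef(lin)[1])),
                   a  = unname(-coef(lin)[2]),
                   b  = 1)

# Step 2: hand those estimates to nls as starting values
fit <- nls(Oil ~ qi / ((1 + b * a * DaysOn)^(1 / b)),
           data = type.data, start = start.vals,
           control = nls.control(warnOnly = TRUE))
print(coef(fit))
```

The log-linear estimates won't match the hyperbolic parameters exactly, but they put nls in the right neighborhood without hard-coding a decline rate.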

You can grab the Spotfire template off of Exchange.ai. The template helps the user define type curves for different regions, then compare the results of the fit curves. Usually you want to compare type curves from different areas (geographically, geologically, or competitively). This means you need to store the results of each type curve run if you like the fit. The template provides that functionality for you.

Going Further

Type Curves are a core part of several oil and gas resource assessment workflows. For anyone doing data science in oil and gas, they are a common and useful technique to have in your back pocket. Because time is one of the strongest predictors for performance of a reservoir – due to the drop in pressure of the reservoir by production – any regression technique needs to take time into account. So, whether you like it or not, you have to take type curves into account!

We will be releasing a more sophisticated version of the type curve analysis that handles gas and water and also gives uncertainty bands around the decline. Stay tuned for that.

The Code

Here’s the full code. I’ve added some extra stuff that I didn’t walk through, including how to forecast the model into the future. The end result is fewer than 50 lines of code.

# We need to convert the date column to a posixct date... just trust me on this one
data$ProdDate = as.POSIXct(data$ProdDate, origin = "1970-01-01", tz="UTC")
minDate = aggregate(x=data$ProdDate, by=list(data$WellName), FUN=min)
colnames(minDate) <- c("WellName","MinDate")
data = merge(data, minDate, by="WellName", all.x=TRUE)

# Calculate Days On
data$MinDate = as.POSIXct(data$MinDate, origin="1970-01-01", tz="UTC")
data$DaysOn = as.numeric(data$ProdDate - data$MinDate, units="days")
data$WellCount = 1

# Bin DaysOn into DaysBin-day groups, labeling each bin by its first day
# (floor keeps the final partial bin, which fixed cutpoints would drop as NA)
data$DaysOn <- floor(data$DaysOn / DaysBin) * DaysBin

# Calculate Mean(Oil), Sum(WellCount) over DaysOn
gdata = aggregate(Oil ~ DaysOn, data=data, FUN=mean)
wcount = aggregate(WellCount ~ DaysOn, data=data, FUN=sum)
gdata = merge(gdata, wcount)

# We are going to extend the DaysOn column to include the number of days to the ForecastYears parameter (e.g. 30)
max.days = max(gdata$DaysOn, na.rm=TRUE)
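if(max.days < ForecastYears*365) {
	gdata = merge(gdata, data.frame(DaysOn=seq(max.days,ForecastYears*365,DaysBin)), all=TRUE)
}

# Fit the Arps curve on the observed history only (forecast rows have no Oil yet),
# using the binned averages as in the walkthrough above
idx = gdata$DaysOn <= max.days & !is.na(gdata$Oil)
fit = nls(Oil ~ qi/((1+b*a*DaysOn)^(1/b)),
			data=gdata[idx,],
			start=list(qi=max(gdata$Oil,na.rm=TRUE), a=.1, b=1),
			control=nls.control(warnOnly=TRUE))

# Predict over the history plus the forecast horizon
gdata$OilPredict = predict(fit, gdata)

Written by Troy Ruths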