Category: Reservoir Engineering

How to Add Lines to a Probit Plot with IronPython

A few weeks ago, I wrote a post detailing how to create a multiple variable probit plot.  This post improved upon an older post on creating a single variable probit plot.  Part of those instructions included adding several “supplemental” lines via Lines and Curves like P10, P90, and the Median.  This is actually the most time-consuming part of the process.  Each line must be added one by one.

Today, while reviewing my instructions, I realized I had to do better. I know this can be done with IronPython! A quick Google search pulled up this TIBCO community post that I was able to use as a guide. I modified that script to work for my probit plot use case. Now, I have a piece of code that will add all of those lines and is easily modifiable and scalable.

The Code

Here is what the code looks like in my DXP.  I made the following modifications from TIBCO’s original:

  1. Changed BarChart to ScatterPlot to suit my visualization
  2. Modified the expressions from an average to the Percentile, P10, P90, and Median.
  3. Added code for vertical lines.

Code for Copy & Paste

from Spotfire.Dxp.Application.Visuals import *

scatterPlot = sp.As[ScatterPlot]()

#Add Horizontal Straight Line
horizontalLine1 = scatterPlot.FittingModels.AddHorizontalLine('P90([Y])')
horizontalLine2 = scatterPlot.FittingModels.AddHorizontalLine('P10([Y])')
horizontalLine3 = scatterPlot.FittingModels.AddHorizontalLine('Median([Y])')
horizontalLine4 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],20)')
horizontalLine5 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],30)')
horizontalLine6 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],40)')
horizontalLine7 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],60)')
horizontalLine8 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],70)')
horizontalLine9 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],80)')

#Add Vertical Straight Line
verticalLine1 = scatterPlot.FittingModels.AddVerticalLine('10')
verticalLine2 = scatterPlot.FittingModels.AddVerticalLine('100')
verticalLine3 = scatterPlot.FittingModels.AddVerticalLine('1000')
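If you have many lines to add, the repeated calls above can be generated in a loop. Here is a sketch in plain Python (the helper names are mine, not part of the Spotfire API) that builds the expression strings; each string is then passed to AddHorizontalLine or AddVerticalLine exactly as in the script above:

```python
def horizontal_expressions(p_labels):
    """Map oil & gas "PXX" labels to Spotfire aggregation expressions.

    PXX corresponds to Spotfire's (100 - XX)th percentile, so P10 becomes
    Percentile([Y],90) and P90 becomes Percentile([Y],10); P50 is the median.
    """
    exprs = []
    for p in p_labels:
        if p == 50:
            exprs.append("Median([Y])")
        else:
            exprs.append("Percentile([Y],%d)" % (100 - p))
    return exprs

def vertical_expressions(values):
    """Fixed x-axis values for vertical lines, e.g. log-scale decades."""
    return [str(v) for v in values]
```

In the IronPython script you would then loop, for example: `for e in horizontal_expressions([10, 20, 30, 40, 50, 60, 70, 80, 90]): scatterPlot.FittingModels.AddHorizontalLine(e)`, and similarly for the vertical lines.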

Detailed Steps

  1. Add a Text Area to the page, right-click, select Edit HTML.
  2. Click the Add Action Control button.
  3. Name the button.
  4. Click the Script button.
  5. Click the New button.
  6. Name the script.
  7. Copy and paste code.  Modify to suit.
  8. Add a parameter called “sp” and connect it to your visualization.
  9. Run the script to test it, then click OK to close the script window.
  10. Modify the HTML as shown to hide the button.  You don’t want to click it again.


  1. Once you run the script, it does not need to be run again.  The first click of Run Script creates 12 lines (9 horizontal and 3 vertical).  Clicking again will create another 12.  I made this mistake when testing.  Then, I had to delete a ton of lines one by one! (Please upvote my Idea to allow users to delete more than one line at a time.)
  2. The script creates the lines, but you still have to edit them one by one.  This might also be possible with IronPython, but I haven’t dug that far yet.
  3. If you copy and paste from a code snippet on the web, make sure the quotation marks come through as plain straight quotes.  Spotfire won't recognize the curly "smart" quotes that browsers and word processors often substitute.

This should make setting up probit plots just a little bit faster.  You can also modify this code any time you want to add multiple lines to a different visualization or another probit plot.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

How to Build a Multiple Variable Probit Plot

A few years ago, I wrote a post on how to add a probit plot to a Spotfire project.  I wrote the post early on in my blogging days, and going back to it was a little painful.  It violates several of my “blogging rules” developed over the years.  For example, I should have simplified the example.  I wrote it, and it was even hard for me to reread.  Blog and learn.  Blog and learn.

Anyway, early in the post, I note, “It is possible to create a single probit plot with multiple variables, but that requires some data wrangling and is not included in this set of instructions.”  Well, these days all my probit plots contain multiple variables.  A manager recently asked me to write up a set of instructions for this common task.  So, here you go.

How to Build a Multiple Variable Probit Plot in Spotfire

Desired Output

First, what exactly are we trying to create?  Here is an example.  You’ll be familiar with the log scale, and as you can see there are multiple lines on the plot for different variables.

High-Level Steps to Desired Output

Creating the desired output is a two-step process.

  1. Unpivot data table
  2. Create probit plot (scatter plot)

Unpivot Data Table

Presumably, the user has a data table that looks like the one shown below.  Each variable is its own column.

The first step to creating a multiple variable probit is to unpivot this data with a transformation. The end result will look like the example below.  Columns are transformed into rows.


The column names Measure and Value can be changed to names the user finds appropriate. The table will be narrower and taller.

Follow the steps below…

  1. Go to Edit – Transformations.
  2. Select Unpivot from the drop-down menu of transformations.
  3. Move all columns that are staying the same to “Columns to pass through”.
  4. Move all other columns, the columns that are being transformed from columns to rows, to “Columns to transform”.
  5. Name the two new columns and make sure the data types are correct. Measure should be string and Value should be real or integer.
  6. Click OK.
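To make the transformation concrete, here is a rough plain-Python sketch of what Unpivot does to the rows (function and column names are mine; Spotfire performs this internally):

```python
def unpivot(rows, pass_through, transform_cols,
            measure_name="Measure", value_name="Value"):
    """Turn one row per well with a column per variable into one row per
    (well, variable) pair -- the narrower, taller table described above."""
    out = []
    for row in rows:
        for col in transform_cols:
            rec = {k: row[k] for k in pass_through}  # columns kept as-is
            rec[measure_name] = col                  # column name -> row value
            rec[value_name] = row[col]
            out.append(rec)
    return out
```

A one-row table with EUR and IP columns becomes two rows, one per measure, which is why filtering on the Measure column later controls which variables appear on the plot.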

With the data taken care of, now create the probit plot.

Create Probit Plot (scatter plot)

High-Level Steps

Creating the plot is actually more time consuming than wrangling the data.  Adding the secondary lines takes the most time and is optional.

  1. Create a basic plot
  2. Configure the visualization
  3. Format the x axis
  4. Add straight line fit
  5. Add secondary lines (optional)
  6. Filter to appropriate content


  1. There are no calculated columns, only a custom expression written on the axis of the scatter plot. Because the expression is written on the axis of the visualization, the calculations will update with filtering.
  2. Filtering on the Measure column will control which variables appear in the plot.
  3. The 90th percentile in Spotfire is equivalent to P10. The 10th percentile in Spotfire is equivalent to P90.

Follow the steps below…

  1. Create basic plot
    1. Add a scatter plot to the page
    2. Set the data table to the “unpivoted” table
  2. Configure the visualization
    1. Place the Value column on the x-axis of the visualization
    2. Right-click on the y-axis of the visualization and select Custom Expression
    3. Enter the following expression: NormInv((Rank([Value],"desc",[Measure]) - 0.5) / Count([Value]) OVER ([Measure])) as [Cum Probability]
    4. Set the Color by selector to the Measure column
  3. Format the x-axis
    1. Right-click on the visualization, select Properties. Go to x-axis menu.
    2. Set the Min and Max range as shown below in Figure 1.
  4. Add straight line fit
    1. Right-click on the visualization, select Properties. Go to the Lines & Curves menu.
    2. Click the Add button to add a Horizontal Line, Straight Line.
    3. Click OK at the next dialog box.
    4. Click the One per Color checkbox as shown below in Figure 2.
  5. Add secondary lines (see Figure 3 below for example)
    1. Horizontal Lines (P10, P50, P90, etc)
      1. If you are still in the Lines & Curves menu, click the Add button to add a Horizontal Line, Straight Line.
      2. To add P10, P50, and P90, select the Aggregated Value radio button as shown below in Figure 3.
        1. For P10, select 90th
        2. For P50, select Median.
        3. For P90, select 10th
      3. For all other values, select the Custom expression radio button as shown below in Figure 4. Enter the expression Percentile([Y],30) and modify it to suit.  For P70, the value is 30.  For P60, the value in the expression should be 40, and so on.
      4. Format the line color, weight, format, and label as desired using the options circled in Figure 5 shown below.
    2. Vertical Lines
      1. The example plot shown has vertical lines at 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, and 10000. Each one must be added individually.
      2. If you are still in the Lines & Curves menu, click the Add button to add Vertical Line, Straight Line.
      3. To add the line, select the Fixed Value radio button and enter the value as shown below in Figure 6.
      4. Format the line color, weight, format, and label as desired using the options circled in Figure 5 shown below.
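As a sanity check, the y-axis custom expression from step 2 can be reproduced outside Spotfire. This is a plain-Python sketch for a single Measure group (the function name is mine, and it assumes NormInv here acts as the standard-normal inverse CDF, as the expression implies):

```python
from statistics import NormalDist

def probit_y(values):
    """Plain-Python version of
    NormInv((Rank([Value],"desc") - 0.5) / Count([Value]))
    for one Measure group: the standard-normal quantile of each value's
    descending-rank cumulative probability."""
    n = len(values)
    order = sorted(values, reverse=True)                  # rank 1 = largest
    rank = {v: order.index(v) + 1 for v in set(values)}   # ties share a rank
    nd = NormalDist()                                     # standard normal
    return [nd.inv_cdf((rank[v] - 0.5) / n) for v in values]
```

The 0.5 offset keeps the largest and smallest values away from probabilities 0 and 1, where the inverse normal is undefined, which is why the expression subtracts it before dividing by the count.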

Reference Figures

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

And now you have a multiple variable probit plot!



Build Type Wells using Selected Wells Method in DCA Wrangler

Reserves evaluators often want to build a Percentile Type Well that represents a certain percentile of the population: a "P90 Type Well", "P50 Type Well", or "P10 Type Well". Expressed this way, the evaluator is inherently seeking a type well that results in a percentile EUR. The P90 Type Well will be a representative well where there is a 90% chance that the EUR will be that number or greater. There are two published methods for creating Percentile Type Wells: the Time Slice approach and the Selected Wells approach.

So, Percentile Type Wells are expected to provide a forecast with an EUR consistent with the target probability. This is not possible with the Time Slice method because that method is based on Initial Productivity (IP) and rates. In other words, the Time Slice method makes an implicit assumption of a strong correlation between IP and EUR, whereas in real-world data the correlation between IP and EUR has wide scatter, resulting in a Type Well with an EUR that does not represent the desired percentile. Refer to SPE-162630 for a more technical discussion of the two methods.

In this blog post we will go through a workflow for creating Type Wells using the Selected Wells method in DCA Wrangler. We created a template that builds Type Wells using the Selected Wells method, the Time Slice method, and individual well forecasts within the Selected Wells method.

Following is the workflow for Selected Wells Method:

  1. Select wells in an Area of Interest (AOI)

  2. Create an Auto-Forecast for all the selected wells with the desired number of years using DCA Wrangler. For the Auto-Forecast we will use a three-segment approach: the first segment with a constrained b-factor between 1 and 2 (this takes care of the characteristic steep initial decline present in most MFHWs in unconventionals); the second segment with a constrained b-factor between 0 and 1; and the third segment for terminal exponential decline.

  3. Generate Well DCA and Well DCA Time results in DCA Wrangler. The Well DCA Time table will have the forecast data for all the wells created using the fitted Arps Model. Remember to refresh these tables every time you change the wells in your AOI.

  4. Next, we will find wells for Target EUR probabilities on an EUR Probit plot generated using all the wells in our AOI. We can enter a threshold value (α) to find wells which have their EUR within the (1 ± α) × EUR at the target probabilities. We can also quickly check the number of wells present within the threshold at each of the target probabilities. Adjust the threshold to get a minimum desired number of wells at each of the target probabilities.

  5. Now we can create Percentile Type Wells for our AOI by running DCA Wrangler in the Type Well mode using the wells we selected in our previous step.
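The well-selection step above (step 4) can be sketched in plain Python. This is a simplified illustration with my own function names, using a nearest-index quantile rather than whatever interpolation DCA Wrangler applies internally:

```python
def wells_near_percentile(eurs, target_prob, alpha):
    """Select wells whose EUR falls within (1 +/- alpha) of the EUR at a
    target exceedance probability (0.90 for a P90 type well: a 90% chance
    the EUR is that number or greater).

    eurs: {well name: EUR}.
    Returns (selected well names, EUR at the target probability).
    """
    ranked = sorted(eurs.values())
    n = len(ranked)
    q = 1.0 - target_prob                       # exceedance prob -> quantile
    idx = min(n - 1, max(0, int(round(q * (n - 1)))))
    target_eur = ranked[idx]
    lo, hi = (1.0 - alpha) * target_eur, (1.0 + alpha) * target_eur
    selected = [w for w, e in eurs.items() if lo <= e <= hi]
    return selected, target_eur
```

Widening alpha admits more wells at each target probability, which is exactly the knob the workflow suggests adjusting to reach a minimum desired well count.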

Check out the template and try it with your production data.

Nitin is a Data Scientist working passionately toward helping companies realize the maximum potential of their data. He has experience with machine learning problems in clustering, classification, and regression, applying ensemble and Bayesian approaches with toolsets from R, Python, and Spotfire. He is currently pursuing his PhD in Petroleum Engineering at Texas A&M University, where his research is focused on applications of machine learning algorithms in petroleum engineering workflows. He enjoys cycling, running, and overindulging in statistical blogs in his spare time.

Well Spacing: More Than One Number

With the rise of unconventionals and the increase in wells permeating already tapped fields, Well Spacing has become the hot topic du jour.  But, what is well spacing?  Does it refer simply to how many wells are in one area?  If so, is that area defined by a circle or rectangle or other definition?  What is the make-up of the nearby wells?  Today, we will examine some common terms and approaches to analyzing well spacing.

Two templates that utilize the features that we will discuss in today’s post are Well Spacing Feature Calculations and Horizontal Well Spacing Model.

Key terms we will discuss today:  Voronoi Diagram, circular and rectangular radius, Area of Interest, intersect area, intersecting wells, closest wells, closest distance, aggregated well statistics.


Jason is a Junior Data Scientist with a Master's degree in Predictive Analytics and Data Science from Northwestern University. He has experience with a multitude of machine learning techniques such as Random Forest, Neural Nets, and Hidden Markov Models. With a previous Master's in Creative Writing, Jason is a fervent believer in the Oxford comma.

Wrangling Data Science in Oil & Gas: Merging MongoDB and Spotfire

Data science in oil and gas is center stage as operators work in the new “lower for longer” price environment. Want to see what happens when you solve data science questions with the hottest new database and the powerful analytics of Spotfire? Read on to learn about our latest analytics module, the DCA Wrangler. If you want to see it in action, scroll down to watch the video.

Layering Data Science on General Purpose Data & Analytics

We are a startup focused on energy analytics and technical data science. We are both TIBCO and MongoDB partners, heavily leveraging these two platforms to solve real-world problems revolving around the application of data science at scale and within the enterprise environment. I started our plucky outfit a little under four years ago. We’ve done a lot of neat things with Spotfire, including analyzing seismic and well log data. Here, we’ll look at competitor/production data.

The Document model allows for flexible and powerful encoding of decline curve models.

MongoDB provides a powerful and scalable general-purpose database system. TIBCO provides tested and forward-thinking general-purpose analytics platforms for both streaming data and data at rest. They also provide great infrastructure products, which aren’t the focus of this blog. We provide the domain knowledge, infusing our proprietary algorithms and data structures for solving common analytics problems into products that leverage the TIBCO and MongoDB platforms.

We believe that these two platforms can be combined to solve innumerable problems in the technical industries represented by our readers. TIBCO provides the analytics and visualization while MongoDB provides the database. This is a powerful marriage for problems involving analytics, single view, or IoT.

In this blog, I want to dig into a specific and fundamental problem within oil and gas and how we leveraged TIBCO Spotfire and MongoDB to solve it — namely Autocasting.

What is Autocasting?

Oil reserves denote the amount of crude oil that can be technically recovered at a cost that is financially feasible at the present price of oil. Crude oil resides deep underground and must be extracted using wells and completion techniques. Horizontal wells can stretch two miles within a vertical window the height of most office floors.

For those with E&P experience, I’m going to elide some important details, like using “oil” for “hydrocarbons” and other technical nomenclature.

Because the geology of the subsurface cannot be examined directly, indirect techniques must be used to estimate the size and recoverability of the resource. One important indirect technique is called decline curve analysis (DCA), which is a mathematical model that we fit to historical production data to forecast reserves. DCA is so prevalent in oil and gas that we use it for auditing, booking, competitor analysis, workover screening, company growth and many other important tasks. With the rise of analytics, it has therefore become a central piece in any multi-variate workflow looking to find the key drivers for well and resource performance.
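The mathematical model at the core of DCA is the Arps decline curve. As a reference point (this is the textbook Arps formulation, not DCA Wrangler's internal code), the rate forecast looks like this:

```python
import math

def arps_rate(qi, di, b, t):
    """Arps decline-curve rate q(t).

    qi: initial rate, di: initial nominal decline (1/time), b: b-factor.
    b = 0 gives exponential decline, b = 1 harmonic, otherwise hyperbolic.
    """
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)
```

Fitting qi, di, and b to historical production for every well in an ensemble, by best-fit optimization, is precisely the "autocasting" task discussed below.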

The DCA Wrangler provides fast autocasting and storage of decline curves. Actual data (solid) is modeled using best-fit optimization on mathematical models (dashed line forecast).

At the heart of any resource assessment model is a robust “autocasting” method. Autocasting is the automatic application of DCA to large ensembles of wells, rather than one at a time.
But there’s a problem: incumbent technologies make the retrieval of decline curves and their parameters very difficult. Decline curve models are complex mathematical forecasts with many components and variations. Retrieving models from a SQL database often requires parsing text expressions and interacting with many tables within the database.

Further, with the rise of unconventionals, the fundamental workflow of resource assessment through decline curves is being challenged. Spotfire has become a popular tool for revamping and making next generation decline curve analysis solutions.

Autocasting in Action

What I am going to demonstrate is a new autocast workflow that would not be possible without the combined performance and capability of MongoDB and Spotfire. I’ll be demonstrating with our DCA Wrangler product, which is one of over 250 analytics workflows we provide through a comprehensive subscription.

It’s important to note that software already exists to decline wells and store their results in a database. People have even declined wells in Spotfire before. What I hope you see in our new product is the step change in performance, ease of use, and enablement when you use MongoDB as the backend.

What’s Next?

First, we have a home run solution for decline curves that requires a MongoDB backend. In the near future, more vendor companies will be leveraging Mongo as their backend database.

Second, I hope you see the value in MongoDB for storing and retrieving technical data and analytic results, especially within powerful tools like Spotfire. Plus, how easy it is to set up and use.

And lastly, I hope you get excited about the other problems that can be solved by marrying TIBCO with MongoDB. Imagine using Streambase as your IoT processor and MongoDB as your deposition environment, or storing models and sensor data within Mongo and using Spotfire to tweak model parameters and co-visualize data.

If you’re interested in learning more about our subscription, get registered today.

Let’s make data great again.

You’ll conquer the present suspiciously fast if you smell of the future… and stink of the past.

Linear Regression, the simplest Machine Learning Model

Linear Regression models are the simplest linear models available in the statistical literature. While the assumptions of linearity and normality seem to restrict the practical use of this model, it is surprisingly successful at capturing basic relationships and predicting in most scenarios. The idea behind the model is to fit a line that mimics the relationship between a target variable and a combination of predictors (called independent variables). Multiple regression refers to one target variable and multiple predictors. These models are popular not only for the prediction task but also as model selection tools, allowing analysts to find the most important predictors and eliminate redundant variables from the analysis.
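For the single-predictor case, the fitted line has a closed-form least-squares solution. A minimal sketch in plain Python (a textbook illustration, not tied to any particular library):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x for one predictor.

    Returns (intercept a, slope b) from the closed-form solution:
    b = cov(x, y) / var(x), a = mean(y) - b * mean(x).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b
```

With multiple predictors the same idea generalizes to solving the normal equations, which is what standard regression routines in R, Python, and Spotfire do under the hood.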


4 Tips for GIS in Spotfire

In the oil and gas industry, ArcGIS is king. There’s no question that when you see a map lying around a corporate office, it was printed from ArcGIS. Over the years, Spotfire has added quite a bit of mapping capability of its own, but those of you handy with Spotfire may know the difficulty of replicating those large ArcGIS maps. Below I’ve included some tips for Spotfire developers who have found themselves crossing into that area.

Understand WMS Layers

While Spotfire handles shapefiles, you may find yourself asking how to create more dynamic maps without all these tables. WMS layers are the answer. If your ArcGIS team already has a MapServer, ask them to publish WMS services for the layers that you want. For example, if you are asking for leases, be sure to recommend the color and outline that you are looking for. When you get the link from them, be sure that it ends with /MapServer/WMSServer?request=GetCapabilities&service=WMS. This is key; otherwise you will be nosing around the MapServer with no success. WMS layers can be stacked on top of each other much like in ArcGIS, but as far as data goes, the power truly lies in the marker plotting in Spotfire.

Set the Zoom Visibility Controls

If you have multiple layers in the map chart, you will want to control whether some layers should be visible at certain zoom levels. For example, if you have feature layers that encompass larger portions of the United States, they may not be necessary at a well level. Use the zoom visibility feature to reduce the impact of these layers at a higher zoom:

Printing the Big Picture

This was a bit of a personal journey, and by that I mean trial and error. My colleague recommended the simplest method by far: export the map chart as a PDF, setting the paper size to A0. For us Americans, I recommend the following infographic:

An A0-sized landscape PDF export is just about what your typical land management executive wants to see for their particular areas. Export to PDF, print on the plotter, done.

Caching for Performance

Be sure to cache these layers as well; performance can be an issue when you are dynamically pulling more than one WMS layer, and it also depends on your latency.

That’s all for now! Let me know if you guys have any more advice on the topic!


Technical Director

Well Deliverability in Spotfire

This is a quick video on how you can use Spotfire to assist oilfield operators in determining the flow rates of gas-drive wells using the inflow performance relationship (IPR) and tubing performance relationship (TPR) built from reservoir, wellbore, and production data.

Theodore Etukuyo holds a Bachelor’s Degree in Mathematics and an Associate Degree in Petroleum Engineering Technology.

Including Formation Tops in Well Log Visualization

It is quite easy to include formation tops in the Well Log Visualization. The neatest way to do that is to have a data table that contains the formation top depth for each well contained in the data table that has the well log data. In its most basic form, the formation tops data table should contain at least 3 columns: Formation Name, Top Depth and Well Name. Here is a video of how to add formation tops to Well Log Visualization:
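The lookup the visualization performs, pulling each well's tops from that three-column table, can be sketched in plain Python (function name and dict-based rows are mine, for illustration):

```python
def tops_for_well(tops, well):
    """Formation tops for one well, shallowest first.

    tops: rows of the three-column table described above, as dicts with
    "Formation Name", "Top Depth", and "Well Name" keys.
    """
    rows = [t for t in tops if t["Well Name"] == well]
    return sorted(((t["Formation Name"], t["Top Depth"]) for t in rows),
                  key=lambda pair: pair[1])
```

Sorting by depth is what lets the tops render as horizontal markers down the log track in stratigraphic order.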

