Category: Production Engineering

How to Add Lines to a Probit Plot with IronPython

A few weeks ago, I wrote a post detailing how to create a multiple variable probit plot.  This post improved upon an older post on creating a single variable probit plot.  Part of those instructions included adding several “supplemental” lines via Lines and Curves like P10, P90, and the Median.  This is actually the most time-consuming part of the process.  Each line must be added one by one.

Today, while reviewing my instructions, I realized I had to do better. I knew this could be done with IronPython! A quick Google search pulled up this TIBCO community post, which I used as a guide. I modified that script to work for my probit plot use case. Now, I have a piece of code that adds all of those lines and is easy to modify and scale.

The Code

Here is what the code looks like in my DXP.  I made the following modifications from TIBCO’s original:

  1. Changed BarChart to ScatterPlot to suit my visualization
  2. Modified the expressions from an average to the Percentile, P10, P90, and Median.
  3. Added code for vertical lines.

Code for Copy & Paste

from Spotfire.Dxp.Application.Visuals import *

# sp is the script parameter connected to the scatter plot
scatterPlot = sp.As[ScatterPlot]()

#Add Horizontal Straight Line
horizontalLine1 = scatterPlot.FittingModels.AddHorizontalLine('P90([Y])')
horizontalLine2 = scatterPlot.FittingModels.AddHorizontalLine('P10([Y])')
horizontalLine3 = scatterPlot.FittingModels.AddHorizontalLine('Median([Y])')
horizontalLine4 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],20)')
horizontalLine5 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],30)')
horizontalLine6 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],40)')
horizontalLine7 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],60)')
horizontalLine8 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],70)')
horizontalLine9 = scatterPlot.FittingModels.AddHorizontalLine('Percentile([Y],80)')

#Add Vertical Straight Line
verticalLine1 = scatterPlot.FittingModels.AddVerticalLine('10')
verticalLine2 = scatterPlot.FittingModels.AddVerticalLine('100')
verticalLine3 = scatterPlot.FittingModels.AddVerticalLine('1000')
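Since every call follows the same pattern, the list of lines could also be built in a loop. The sketch below is one way to do that and is not the TIBCO original. The names `add_probit_lines` and `RecordingFittingModels` are hypothetical, and the only Spotfire calls it assumes are the `AddHorizontalLine` and `AddVerticalLine` methods already used above. The stand-in class exists only so the snippet runs outside Spotfire.

```python
def add_probit_lines(fitting_models, percentiles, x_values):
    """Add one horizontal line per expression and one vertical line per x value.

    fitting_models is expected to expose AddHorizontalLine and AddVerticalLine,
    as scatterPlot.FittingModels does in the script above.
    """
    # Named aggregations first, then Percentile([Y],n) for each requested n.
    expressions = ['P90([Y])', 'P10([Y])', 'Median([Y])']
    expressions += ['Percentile([Y],{0})'.format(p) for p in percentiles]

    for expression in expressions:
        fitting_models.AddHorizontalLine(expression)
    for x in x_values:
        fitting_models.AddVerticalLine(str(x))
    return len(expressions) + len(x_values)


class RecordingFittingModels(object):
    """Hypothetical stand-in that records calls so the sketch runs without Spotfire."""

    def __init__(self):
        self.horizontal = []
        self.vertical = []

    def AddHorizontalLine(self, expression):
        self.horizontal.append(expression)

    def AddVerticalLine(self, expression):
        self.vertical.append(expression)


recorder = RecordingFittingModels()
line_count = add_probit_lines(recorder, [20, 30, 40, 60, 70, 80], [10, 100, 1000])
print(line_count)
```

Inside a Spotfire script, the first argument would be `sp.As[ScatterPlot]().FittingModels` rather than the recorder. Adding or removing a line then means editing one of the two lists.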

Detailed Steps

  1. Add a Text Area to the page, right-click, select Edit HTML.
  2. Click the Add Action Control button.
  3. Name the button.
  4. Click the Script button.
  5. Click the New button.
  6. Name the script.
  7. Copy and paste code.  Modify to suit.
  8. Add a parameter called “sp” and connect it to your visualization.
  9. Run the script to test. Click OK to close the script window.
  10. Modify the HTML as shown to hide the button.  You don’t want to click it again.

Caveats

  1. Run the script only once.  The first click on Run Script creates every line in the script (13 in my DXP); clicking again creates a second set on top of the first.  I made this mistake when testing.  Then, I had to delete a ton of lines one by one! (Please upvote my Idea to allow users to delete more than one line at a time).
  2. The script creates the lines, but you still have to edit them one by one.  This might also be possible with IronPython, but I haven’t dug that far yet.
  3. If you copy and paste from my code snippet above, check the quotes.  If they arrive as curly quotes, replace them with straight quotes; Spotfire won’t recognize the curly ones.

This should make setting up probit plots just a little bit faster.  You can also modify this code any time you want to add multiple lines to a different visualization or another probit plot.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

How to Build a Multiple Variable Probit Plot

A few years ago, I wrote a post on how to add a probit plot to a Spotfire project.  I wrote the post early on in my blogging days, and going back to it was a little painful.  It violates several of my “blogging rules” developed over the years.  For example, I should have simplified the example.  I wrote it, and even I found it hard to reread.  Blog and learn.  Blog and learn.

Anyway, early in the post, I note, “It is possible to create a single probit plot with multiple variables, but that requires some data wrangling and is not included in this set of instructions.”  Well, these days all my probit plots contain multiple variables.  A manager recently asked me to write up a set of instructions for this common task.  So, here you go.

How to Build a Multiple Variable Probit Plot in Spotfire

Desired Output

First, what exactly are we trying to create?  Here is an example.  You’ll be familiar with the log scale, and as you can see there are multiple lines on the plot for different variables.

High-Level Steps to Desired Output

Creating the desired output is a two-step process.

  1. Unpivot data table
  2. Create probit plot (scatter plot)

Unpivot Data Table

Presumably, the user has a data table that looks like the one shown below.  Each variable is its own column.

The first step to creating a multiple variable probit is to unpivot this data with a transformation. The end result will look like the example below.  Columns are transformed into rows.

 

The column names Measure and Value can be changed to names the user finds appropriate. The table will be narrower and taller.

Follow the steps below…

  1. Go to Edit – Transformations.
  2. Select Unpivot from the drop-down menu of transformations.
  3. Move all columns that are staying the same to “Columns to pass through”.
  4. Move all other columns, the columns that are being transformed from columns to rows, to “Columns to transform”.
  5. Name the two new columns and make sure the data types are correct. Measure should be string and Value should be real or integer.
  6. Click OK.
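For readers who want to see what the Unpivot transformation does to the data, here is a minimal plain-Python sketch of the same reshaping. It is only an illustration and does not show how Spotfire implements it. The `unpivot` function and the well data are made up for the example, while the Measure and Value column names follow the steps above.

```python
def unpivot(rows, pass_through, transform, name_column='Measure', value_column='Value'):
    """Turn each column in `transform` into its own row, keeping `pass_through` columns."""
    result = []
    for row in rows:
        for column in transform:
            # Copy the columns that stay the same, then add the name/value pair.
            new_row = dict((key, row[key]) for key in pass_through)
            new_row[name_column] = column
            new_row[value_column] = row[column]
            result.append(new_row)
    return result


# Made-up example data: one row per well, one column per variable.
wide = [
    {'Well': 'A', 'Oil': 120.0, 'Gas': 900.0},
    {'Well': 'B', 'Oil': 45.0, 'Gas': 310.0},
]
tall = unpivot(wide, pass_through=['Well'], transform=['Oil', 'Gas'])
for row in tall:
    print(row)
```

Two wide rows with two transformed columns become four tall rows, which is the "narrower and taller" shape described above.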

With the data taken care of, now create the probit plot.

Create Probit Plot (scatter plot)

High-Level Steps

Creating the plot is actually more time-consuming than wrangling the data.  Adding the secondary lines takes the most time and is optional.

  1. Create a basic plot
  2. Configure the visualization
  3. Format the x axis
  4. Add straight line fit
  5. Add secondary lines (optional)
  6. Filter to appropriate content

Notes:

  1. There are no calculated columns, only a custom expression written on the axis of the scatter plot. Because the expression is written on the axis of the visualization, the calculations will update with filtering.
  2. Filtering on the Measure column will control which variables appear in the plot.
  3. The 90th percentile in Spotfire is equivalent to P10. The 10th percentile in Spotfire is equivalent to P90.
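To make the notes above concrete, here is a rough plain-Python sketch of what the y-axis custom expression used in the steps below appears to compute: a descending rank within each measure, converted to a probability with the (rank - 0.5) / count rule and then passed through the inverse normal. This is Python 3 for illustration outside Spotfire, not an IronPython script. The function name and sample numbers are made up, and ties are not handled the way Spotfire's Rank may handle them.

```python
from statistics import NormalDist


def cum_probability_z(values):
    """Return (value, z) pairs where z = NormInv((descending rank - 0.5) / count)."""
    count = len(values)
    ordered = sorted(values, reverse=True)  # rank 1 is the largest value
    standard_normal = NormalDist()
    pairs = []
    for rank, value in enumerate(ordered, start=1):
        probability = (rank - 0.5) / count
        pairs.append((value, standard_normal.inv_cdf(probability)))
    return pairs


# Made-up values, grouped by measure the way OVER ([Measure]) groups them.
values_by_measure = {'Oil': [10, 200, 50, 1000, 400], 'Gas': [3000, 800]}
points = dict(
    (measure, cum_probability_z(values))
    for measure, values in values_by_measure.items()
)
for measure, pairs in points.items():
    print(measure, pairs)
```

Because the rank is descending, the largest values get the smallest probabilities. That reading is consistent with note 3, where P10 lines up with the 90th percentile.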

Follow the steps below…

  1. Create basic plot
    1. Add a scatter plot to the page
    2. Set the data table to the “unpivoted” table
  2. Configure the visualization
    1. Place the Value column on the x-axis of the visualization
    2. Right-click on the y-axis of the visualization and select Custom Expression
    3. Enter the following expression: NormInv((Rank([Value],"desc",[Measure]) - 0.5) / Count([Value]) OVER ([Measure])) as [Cum Probability]
    4. Set the Color by selector to the Measure column
  3. Format the x-axis
    1. Right-click on the visualization, select Properties. Go to x-axis menu.
    2. Set the Min and Max range as shown below in Figure 1.
  4. Add straight line fit
    1. Right-click on the visualization, select Properties. Go to the Lines & Curves menu.
    2. Click the Add button and select Straight Line Fit.
    3. Click OK at the next dialog box.
    4. Click the One per Color checkbox as shown below in Figure 2.
  5. Add secondary lines (see Figure 3 below for example)
    1. Horizontal Lines (P10, P50, P90, etc)
      1. If you are still in the Lines & Curves menu, click the Add button to add a Horizontal Line, Straight Line.
      2. To add P10, P50, and P90, select the Aggregated Value radio button as shown below in Figure 3.
        1. For P10, select 90th
        2. For P50, select Median.
        3. For P90, select 10th
      3. For all other values, select the Custom expression radio button as shown below in Figure 4. Enter this expression: Percentile([Y], 30) and modify the number.  For P70, the value is 30.  For P60, the value in the expression should be 40, and so on.
      4. Format the line color, weight, format, and label as desired using the options circled in Figure 5 shown below.
    2. Vertical Lines
      1. The example plot shown has vertical lines at 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, and 10000. Each one must be added individually.
      2. If you are still in the Lines & Curves menu, click the Add button to add Vertical Line, Straight Line.
      3. To place the line, select the Fixed Value radio button and enter the value as shown below in Figure 6.
      4. Format the line color, weight, format, and label as desired using the options circled in Figure 5 shown below.
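The vertical line values in the example follow a regular pattern (1 through 9 times each power of ten, plus the final 10000), so they can be generated rather than typed. Below is a small sketch of that; `log_grid_values` is a hypothetical helper name, not something from Spotfire.

```python
def log_grid_values(start_exponent, end_exponent):
    """Return 1-9 multiples of each power of ten, plus the final power of ten."""
    values = []
    for exponent in range(start_exponent, end_exponent):
        decade = 10 ** exponent
        values.extend(multiple * decade for multiple in range(1, 10))
    values.append(10 ** end_exponent)
    return values


# 10, 20, ..., 90, 100, 200, ..., 9000, 10000
x_values = log_grid_values(1, 4)
print(len(x_values), x_values[0], x_values[-1])
```

Each value could then be entered as a Fixed Value, or converted to a string and passed to an `AddVerticalLine` call like the ones in the IronPython post above.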

Reference Figures

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

And now you have a multiple variable probit plot!

Spotfire Version

Content created with Spotfire 7.12.


Sparklines can show the shape of highly variable data in a small amount of space

In the Spotfire Graphical Table visualization, sparklines are a fantastic way to quickly visualize our data in table format. But what if we have highly variable data that would be better shown on a logarithmic Y-axis? Note that sparkline axes have no option for a logarithmic scale. We have two options here: use multiple scales, or write a custom expression combined with multiple scales.

One of the main uses of sparklines is to show the “shape” of our data. If our data range is less variable, then a single arithmetic scale for all sparkline axes is fine. However, in the case below we need to use a different arithmetic scale for each sparkline in the column to honor the high variability of the data.

Go to the Graphical Table’s Properties > Axes and select the sparkline column as seen below. Now select the Settings button for that sparkline column.

Then, select Axes and change the radio button under “Y-axis scale” from “One scale for all sparklines in this column” to “Multiple scales.” We do the same for every sparkline column with highly variable data that we want to “compare,” as seen in the second sparkline column in the Graphical Table below.

Note the huge improvement in being able to see the “shape” of our data with “Multiple scales” selected, compared to the first visualization above.

Next, if we want a logarithmic scale, we can easily write a custom expression, as seen below, by right-clicking the Y-axis name in Sparkline Settings.

Insert the Log function into the Custom Expression, then click OK.

Compare the final log-scale Graphical Table below to the previous two arithmetic tables above. Note that the visualization below has “Multiple scales” selected.

Finally, if we use “One scale for all sparklines in this column” instead of “Multiple scales,” the results may not show enough differentiation, especially if you have extreme outliers. Compare this last log image with our first arithmetic one.
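The effect of these options can be sketched numerically. The plain-Python example below is only an illustration of the idea, not how Spotfire draws sparklines. The function names and rates are made up, values must be positive for the log case, and a row with no variation would divide by zero.

```python
import math


def scale_to_unit(values, low, high):
    """Map values linearly onto 0..1 using the given axis range."""
    span = high - low
    return [(value - low) / span for value in values]


def sparkline_heights(rows, multiple_scales, use_log):
    """Return per-row plotted heights under the chosen scale options."""
    if use_log:
        rows = [[math.log10(value) for value in row] for row in rows]
    if multiple_scales:
        # Each sparkline gets its own min and max.
        return [scale_to_unit(row, min(row), max(row)) for row in rows]
    # One shared min and max for every sparkline in the column.
    low = min(min(row) for row in rows)
    high = max(max(row) for row in rows)
    return [scale_to_unit(row, low, high) for row in rows]


# Made-up rates for two wells with very different magnitudes.
rows = [[1.0, 2.0, 4.0], [1000.0, 5000.0, 100000.0]]
shared = sparkline_heights(rows, multiple_scales=False, use_log=False)
separate = sparkline_heights(rows, multiple_scales=True, use_log=False)
log_separate = sparkline_heights(rows, multiple_scales=True, use_log=True)
print(shared[0])
print(separate[0])
print(log_separate[0])
```

With one shared arithmetic scale, the first well is flattened almost to zero by the second well's range. With multiple scales, each row uses its own full height and its shape becomes visible.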

Paul is a geoscientist specializing in data analytics and visualizing a lot of data in a small amount of space. As Edward Tufte says, “There is no such thing as data overload, just a failure of design.”

Including Formation Tops in Well Log Visualization

It is quite easy to include formation tops in the Ruths.ai Well Log Visualization. The neatest way to do that is to have a data table that contains the formation top depth for each well contained in the data table that has the well log data. In its most basic form, the formation tops data table should contain at least 3 columns: Formation Name, Top Depth and Well Name. Here is a video of how to add formation tops to Ruths.ai Well Log Visualization:


Theodore Etukuyo holds a Bachelor’s Degree in Mathematics and an Associate Degree in Petroleum Engineering Technology.

One-Stop Tool for Viewing Subsurface and Well Trajectories in 3D

If you are a petroleum engineer and you have Spotfire installed on your computer, you’re further ahead than you realize. Sound strange? Here’s the hint:

Spotfire can be your tool for:

  • visualizing and analyzing 3D subsurface maps and
  • diagramming well trajectory

And you can do all that in one environment.

Now, I’m guessing you thought those analyses could only be done with high-end, expensive software from, say, Schlumberger, Halliburton, IHS, or Baker Hughes. Not anymore.


Comparing Average with Values in Same Plot in Spotfire

A common task for an analyst is to plot an average in the same chart as the individual values being compared, in order to show how far each one deviates. For instance, a chart of last year’s salaries for a certain job function in 3 US cities can include the US national average, showing viewers how far each city’s salary departs from it.


Well Deliverability (IPR/TPR) using Spotfire

The deliverability of a system is its ability to deliver gas as a function of pressure. The Ruths.ai Well Deliverability tool was developed to help oilfield operators determine the flow rates of gas-drive wells using the inflow performance relationship (IPR) and tubing performance relationship (TPR), based on reservoir, wellbore and production data.
