Two weeks ago, I published a Linear and Logistic Regression template on Exchange.ai. When I built the template, my process was as follows:
When following this process for the logistic regression model (a classification model), Spotfire inserts two new columns: ProbPrediction, the predicted probability, and ClassPrediction, the predicted class. I noticed that some records contained a value for ClassPrediction but not ProbPrediction, which seemed odd. This happened in records where one or more of my predictor columns were null, in which case neither column should have been populated.
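The expected behavior can be sketched in a few lines. This is a hedged Python/NumPy illustration with made-up data and coefficients (the template itself runs inside Spotfire), showing how a null predictor should propagate to both output columns:

```python
import numpy as np
import pandas as pd

# Hypothetical predictor columns; the last row has a null predictor.
df = pd.DataFrame({"x1": [0.5, 1.2, np.nan], "x2": [3.0, -0.7, 2.1]})

# Illustrative fixed coefficients, standing in for a fitted logistic model.
intercept, w = -0.2, np.array([0.8, -0.5])

logits = intercept + df[["x1", "x2"]].to_numpy() @ w
prob = 1.0 / (1.0 + np.exp(-logits))   # NaN predictors propagate to NaN here

# Correct behavior: when the probability is null, the class must be null too.
df["ProbPrediction"] = prob
df["ClassPrediction"] = np.where(np.isnan(prob), np.nan,
                                 (prob >= 0.5).astype(float))
```

Row 3 ends up null in both columns; the bug I saw was a class value appearing without a probability.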
It turns out that this is a bug that can be fixed with the steps below.
See below for a screenshot of the console.
After I relaunched Spotfire and reran the model, I saw consistent population of the ProbPrediction and ClassPrediction columns. If you have any questions, feel free to contact me at firstname.lastname@example.org.
Anna Smith is an Engineering Technician at Continental Resources up in Oklahoma. Today she will be sharing her journey creating average lines using TERR.
I had often been asked for average lines on line graphs: the average of a dataset shown alongside each individual line in that dataset. I kept trying to build one with calculated columns and formatting tricks, but eventually concluded that Spotfire just doesn’t give us an easy or clean way to do this. So the idea of using TERR came into play. In my example, we wanted to compare production over time to the average over time for a certain well set, and we wanted this to be dynamic: if we change the selected well set, the calculated average line needs to change with it. Our TERR code, then, needed to subset each day, calculate an average for that day, and output a new value. An important note: the function given at the end of the article requires days or months as its input, which means that if your dataset has only dates and production numbers, you need to normalize those dates back to time zero.
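The two steps above (normalize dates to time zero, then average each day across wells) can be sketched as follows. TERR itself runs R; this is a Python/pandas illustration of the same logic, with hypothetical column names:

```python
import pandas as pd

# Hypothetical production data: one row per well per date.
df = pd.DataFrame({
    "well": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2020-01-01", "2020-01-02",
                            "2020-02-01", "2020-02-02"]),
    "production": [100.0, 90.0, 80.0, 60.0],
})

# Normalize each well's dates back to time zero: days since its first record.
df["day"] = (df["date"] - df.groupby("well")["date"].transform("min")).dt.days

# Subset each day across the selected wells and average it.
avg = df.groupby("day", as_index=False)["production"].mean()
```

Because the grouping runs over whatever rows are passed in, re-running it on a different well selection recomputes the average line, which is the dynamic behavior we needed.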
PCA (Principal Component Analysis) is a core data science technique: it not only reveals collinearity among the independent variables in a dataset, but can also provide a reduced-dimensional model by rotating high-dimensional data into a lower-dimensional space. Here’s some quick info on getting PCA in Spotfire. If you want more background on PCA, of course check out Wikipedia.
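As a quick refresher on what that rotation means, here is a minimal NumPy sketch of PCA via SVD on toy data (synthetic data, not related to Spotfire's implementation): the data is built with only two underlying factors, so the top two components capture essentially all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "high-D" data: 100 samples of 5 variables driven by 2 hidden factors,
# so the 5 columns are strongly collinear.
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5))

# PCA: center the data, then take the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

explained = s**2 / np.sum(s**2)   # fraction of variance per component
scores = Xc @ Vt[:2].T            # data rotated into the top-2 subspace
```

`explained` makes the collinearity visible (the first two entries sum to nearly 1), and `scores` is the reduced two-dimensional model of the original five columns.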
A common task for an analyst is to plot averaged values in the same chart as the compared variables in order to show deviation. For instance, a visual representation of salaries for a certain job function in three US cities over the last year can include the US national average, informing viewers how far each city’s salary departs from it.
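The computation behind such a chart is small. In this hedged sketch the city names and figures are invented, and the sample mean stands in for a true national average:

```python
import pandas as pd

# Hypothetical salaries (in $k) for one job function in three US cities.
df = pd.DataFrame({"city": ["Austin", "Denver", "Boston"],
                   "salary": [95.0, 100.0, 120.0]})

# Reference value for the average line (sample mean as a stand-in).
national_avg = df["salary"].mean()

# Departure of each city from the reference line.
df["deviation"] = df["salary"] - national_avg
```

Plotting `salary` as bars with a horizontal line at `national_avg` gives exactly the picture described: each bar's distance from the line is its `deviation`.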
The deliverability of a system is its ability to deliver gas as a function of pressure. The Ruths.ai Well Deliverability tool helps oilfield operators determine the flow rates of gas-drive wells using the inflow performance relationship (IPR) and tubing performance relationship (TPR) derived from reservoir, wellbore, and production data.
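The core idea is that the operating rate sits where the two curves cross. The tool's actual models aren't reproduced here; as a hedged sketch, this uses the standard Rawlins–Schellhardt back-pressure form for the IPR and an assumed linear tubing curve, with illustrative parameter values:

```python
# Rawlins-Schellhardt back-pressure equation (standard textbook IPR form):
#   q = C * (Pr^2 - Pwf^2)^n      -- C, n, Pr below are illustrative only.
C, n, Pr = 0.01, 0.8, 3000.0      # coefficient, exponent, reservoir pressure (psia)

def ipr(pwf):
    """Gas rate the reservoir can deliver at bottomhole pressure pwf."""
    return C * (Pr**2 - pwf**2) ** n

def tpr(q):
    """Assumed linear tubing curve: bottomhole pressure needed to lift rate q."""
    return 500.0 + 0.5 * q

# Operating point: the pwf where the curves intersect, tpr(ipr(pwf)) == pwf.
# Found here by simple bisection.
lo, hi = 0.0, Pr
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if tpr(ipr(mid)) > mid:
        lo = mid                  # tubing demands more pressure: move up
    else:
        hi = mid
pwf_op = 0.5 * (lo + hi)          # operating bottomhole pressure
q_op = ipr(pwf_op)                # deliverable gas rate at that pressure
```

Any IPR and TPR models can be dropped into `ipr` and `tpr`; the intersection logic stays the same.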
A common reservoir engineering workflow is identifying the production decline model(s) that suit the production profile of a well or reservoir. Quite frequently, asset management experts are tasked with calculating the values of the determining factors and applying them in decline curve equations to establish forecast methodologies for the different periods in an asset’s production history.
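The decline curve equations usually meant here are the standard Arps family, where the b-factor is the determining factor that selects the model. This sketch uses illustrative parameter values, not data from any particular asset:

```python
import numpy as np

def arps(qi, di, b, t):
    """Arps decline-curve rate at time t (standard forms):
    b = 0 exponential, 0 < b < 1 hyperbolic, b = 1 harmonic."""
    t = np.asarray(t, dtype=float)
    if b == 0:
        return qi * np.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Illustrative inputs: initial rate 1000 bbl/d, nominal decline 0.1/yr.
t = np.linspace(0.0, 10.0, 11)            # years
q_exp = arps(1000.0, 0.1, 0.0, t)         # exponential
q_hyp = arps(1000.0, 0.1, 0.5, t)         # hyperbolic
q_har = arps(1000.0, 0.1, 1.0, t)         # harmonic
```

Fitting `qi`, `di`, and `b` to different periods of the production history, then extrapolating the fitted curve, is the forecast methodology the workflow above describes: higher b-factors decline more slowly, so the choice of model materially changes the forecast.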