Category: TERR

Spatial Objects Using TERR

A key part of analytics in the oil and gas industry is evaluating opportunities at different locations. Space is always present when looking for profitable development projects. We usually look at the already in production wells and try to find some spatial trends. To stay competitive, we need to find better ways  to access the  data of different areas and its wells. For instance, we can transform the spatial information to compact objects that store the location and shape of each well and lease. These objects can be feed  to different calculations and analyses as geometries. For Spotfire, it also has some advantages, you can use the feature layers of the map chart. In this case, we can visualize the leases as polygons and wells as lines.

Read More

TERR — Converting strings to date and time

This post explains my struggle to convert strings to Date or Time with TERR.  I recently spent so much time on this that I thought it deserved a blog post.  Here’s the story…

I was recently working on a TERR data function that calls a publicly available API and brings all the data into a table.  I used the as.data.frame function to parse out my row data.  In that function, I used the stringsAsFactors = FALSE argument, and as a result (the desired result), all of my data came back as strings.  This was fine because the API included column metadata with the data type.  As you can see in the script below, I planned on “sapplying” through the metadata with as.POSIXct and as.numeric.  This worked just fine in RStudio, and it also worked for the numeric columns and for the DateTime columns.  However, it did not work for Date and Time columns.  I tried different syntax, functions (as.Date didn’t work either), packages, etc to get it to work and NOTHING!  The struggle was very real.

Script convert strings to Date or Time with TERR

 

Finally, I Googled the right terms and came across a TIBCO knowledge base article with this information….

Spotfire data functions recognize TERR objects of class “POSIXct” as date/time information. As designed, the Spotfire/TERR data function interface for date/time information does the following:

– Converts a Spotfire value or column whose DataType is “Date”, “Time” or “DateTime” into a TERR object of class “POSIXct”.

– Converts a TERR object of class “POSIXct” into a Spotfire value or column with a DataType of “DateTime”, which can then be formatted in Spotfire to display only the date (or to display only the time) if needed.

This interface does not use any other TERR object classes (such as the “Date” class in TERR) to transfer date/time information between Spotfire and TERR.

That told me that all my effort was for naught, and it just wasn’t possible.  I contacted TIBCO just to make sure there wasn’t some other solution out there that the article was not addressing.  In the end, I just used a transformation on the Date and Time columns to change the data type.  I hope that you, dear Reader, find this post before you spend hours on the same small problem.  I did put in an enhancement request.  Fingers crossed.  Please let me know if you have a better method!

 

 

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

Part 5 – Data Function Properties

This is the fifth part of a series on Spotfire Properties.  In previous posts, I discussed Document PropertiesData Table PropertiesColumn Properties, and Data Connection Properties.  This week we’ll take a look at data function properties.  Next week, the series will conclude with Visualization Properties.  Data function properties is a bit of a beefier subject because all data functions can be created out of the data function properties menu.
 

What is a Data Function?

Since I am writing this for the new Spotfire user, let’s start with the basics. What is a data function, and why would you need one? A data function is TERR code written to perform a specific task within Spotfire. What kind of tasks you ask? I’ll come back to that in just a sec. First, I want to define TERR. TERR stands for TIBCO Enterprise Runtime for R. TERR is the Spotfire version of open source R. R is a programming language for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis (Wiki). 
As it relates to Spotfire, TERR expands upon the functionality of the application. For example, Exchange.ai is about to post a template that uses TERR to identify duplicate records (link coming soon).  You can use TERR to manipulate data, pass data thru to new tables, and expand upon the statistical and predictive modeling tools that come with the application. For example, Spotfire doesn’t have a Random Forest model, but you can build one with TERR.  The only limits are really your own programming skills. 
 
Now, I’ve already written a blog post on TERR basics, which includes an explanation of all the TERR screens.  Most of what is in that post I would have also covered here.  Therefore, I am going to use this post to expand on the different options in the screens, rather than walking through a single use case.  Lastly, Exchange.ai also has a few templates such as this template and this template to get you started.    
 

Data Function Basics

Let’s review the basics of Data Functions.  Creating data functions is a 3 step process.
  1. Create the script
  2. Create the parameters
  3. Run the script to map the parameters to the data in the DXP
Key concepts to be aware of:
  • The script is the “meat” of the data function. Within the script, you’ll find at least one input and one output parameter.  The simplest R script I’ve ever written is output <- input.  Input is the input parameter, and…yeah, you can finish that sentence I bet.  
  • TERR (and R) are object-oriented languages, which means programmers can create objects within the code, assign values to the objects and then reference the object down the road rather than all the values.  This makes programming easier. In the example above, input and output are both objects.
  • Input and output parameters tell Spotfire what type of object to work with.  The object could be a table, column, document property or another object. 
  • Running the script triggers the dialogs where you will map the parameters to the actual data in the DXP.
  • Data functions can be connected to marking and filtering.  For example, you can pass the results of marking or filtering to a new table.
  • Users may create data functions from scratch in Spotfire, or users may import data functions from the Spotfire library or another file.
  • By default, data functions embed within the analysis. However, users have the ability to save them to the library for reuse or sharing.
 
With that said, here is a very simple script that I’ll reference throughout the post.  This single line of R code that will check for duplicates in a data set using two identifiers that define the granularity of the table. The output is a column called duplicate.

duplicate <- duplicated(data.frame(key1, key2))

duplicated and data.frame are TERR (or R) functions
duplicate, key1, and key2 are objects in the script
key1 and key2 are input parameters
duplicate is the output parameter
 

Data Function Properties Main Screen

This dialog lists all data functions in the DXP.  From this screen you can …
  • Create new data functions
  • Edit existing data functions and their parameters
  • Refresh data functions
  • Delete data functions
  • Save data functions to the library
  • Export a data function
Data Function Properties
 
Related to the buttons above:
  • The term “Register New” can be a bit confusing to new users.  This really means create a new data function.  In the process, you’ll have the option to save it in the library or register it.  
  • Clicking the Edit Script button will let you modify the script or the input and output parameters.
  • Clicking the Edit Parameters button allows you to change the mapping of data from the parameters to the DXP content.
 

Script & Parameters

These three dialogs define the script, input parameters, and output parameters.  Because the TERR Basics post covers them in detail, I want to focus on the different options available to the input and output parameters.  
Data Function PropertiesData Function PropertiesData Function Properties

Input and Output Parameters

When creating input and output parameters, there are three options — table, value, and column.  ‘
  • If your input or output is an entire table, choose Table.  I use this option when I am simply passing a limited data set from my original table to a new table.
  • If your input or output is a single column, choose Column.  The script shown above for identifying duplicates uses a Column output.  The data function creates a column called “duplicate”.
  • If your input is a hard-coded value or a document property, choose Value.

Data Function Properties

Data Function Properties

Run the Data Function

After you have entered the script, input parameters, and output parameters, the next step is clicking the Run button.  If Spotfire asks if you want to save the data function to the library, you can say no.  It will not impact your DXP.  This is simply to give the option to save the data function to the library so others may access it.  As an administrator, I ask users NOT to do this because it clutters up the library.  It is also hard to know what a given data function is for or if it even works.

Input Parameters

Anyway, this is the step in the process where you map the parameters to the content of the DXP.  Let’s tackle the inputs first.  I have intentionally added two unnecessary parameters to demonstrate that the options for input handlers depend on the type of input parameter.  Each input parameter type has different options.

  • For Column type, there are three options — Column, Expression, and None.  The most common input handler is Column, which I have used in data functions that manipulate or calculate based on a specific column of data.
  • For Value type, there are six options — Value, Document property, Data table property, Column property, Expression, and None.  I most frequently use Document property.
  • For Table type, there are three options — Columns, Expression, and None.  You can tell Spotfire to work with a subset of the columns in the table by using the Select Columns button. Alternatively, typing “*” in “Search expression” will use all columns in a table.  It’s not visible in the screenshot shown, but just below the “Search expression” section, you will also find options to connect the contents of the table to marking or filtering.  This is explained in the TERR Basics post.

I do want to note that I have never used the None option in either input or output handlers.  If someone has, please tell me about it in Comments.

Data Function Properties

Data Function Properties

Data Function Properties

 

Output Parameters

Now, for outputs, it is also true that the options presented differ depending on the parameter type.  As you can see, Column, Value, and Table all have different options.

  • The Column and Table Type have the same four options — Data table, Columns, Rows, and None.  Use Data table if you are creating an entirely new table.  Set the type to Columns if the output is a column that should be added to another table.  Use Rows if you are adding rows to a table.
  • In Value Type, there are six options — Data table, Columns, Rows, Document property, Data table property, Column property, and None.  The same advice is true of outputs here as for inputs.

Data Function Properties Data Function PropertiesData Function Properties

 

As I was writing this, I realized that if I were creating a data function that output rows, I’m not sure which type I would use.  The options for adding rows are part of both the Column and Table Type.  Setting up a Column type to insert rows seems counter-intuitive.  I just haven’t had to write this type of data function yet.  If you know, please Comment!

Hopefully, explaining some of the common uses of the different types of input and output parameters will help you better understand TERR function and how to convert R code to TERR.  Thanks!

 

 

 

 

 

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

RStudio Error – “ERROR r error 4”

A few weeks ago, I wrote a post on TERR compatible versions of R and RStudio. At that time, I realized I needed to update my RStudio and R versions. In doing that, I ran into an RStudio error when launching my newly updated RStudio.  Since I ran into this problem, I am sure other users will too.  
RStudio Error
 
This error occurs when RStudio can’t find the installation of R. Fixing it is super simple. Simply hold down the Ctrl key when launching R and a popup will appear that allows you to specify the R installation location.
Fixing Error
Now, I’m unsure what order of operations I followed and/or if the order of operations caused my problem, but I would recommend the following…
  1. Uninstall R
  2. Uninstall RStudio
  3. Install new R
  4. Install new RStudio

Hopefully, your installation goes easier than mine did.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

TERR Scripts to Read/Write to MS Access

Ruths.ai recently published a free template on the Ruths.ai Exchange that reads and writes data from/to MS Access.  Under the covers, you’ll find two property controls and two data functions working with the RODBC package.  Now, we know that templates are good, but being able to replicate the work is better.  Users want to be able to recreate that functionality in their own files, which is why I am writing this post to explain the code and how everything fits together so you can recreate this functionality in your own DXP files.  Before reading any farther, use this link to download a copy of the template and familiarize yourself with how it works.

Read More

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

Compatible Versions of TERR, RStudio and R

  • Do you know how to check which version of TERR is installed on your Spotfire installation?
  • Are you unsure of which versions of R or RStudio to download so that you know it lines up with TERR?
  • Did you have it all figured out but now you’ve upgraded and lost all the answers???

Read More

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

Using the “spTimer” Package to Model Spatio-Temporal Data in R

The “spTimer” package uses three Bayesian models to fit Spatio-Temporal Data. The data may be given at sparse spatial stations, where observations at each station are considered time series. The package can model the residual spatio-temporal variation to measure uncertainty. It also gives flexibility to customize covariance function selection, the hyper-parameters of the prior distributions and the tuning parameters for the implemented MCMC algorithms.

Read More

Using Support Vector Machines in Spotfire

(Image Source: opencv.org)

Support Vector Machines (SVMs) is one of the most popular and most widely used machine learning algorithms today. It is robust and allows us to tackle both classification and regression problems. In general, SVMs can be relatively easy to use, have good generalization performance, and often do not require much tuning. Follow this link for further information regarding support vector machines. To help illustrate the power of SVMs, we thought it would be useful to go through an example using a custom template we have created for SVMs.

Read More

Emanuel holds a Master of Science Degree in Data Science and Analytics from the University of Oklahoma and brings years of experience in production engineering.