The Most Important Thing You Don’t Know About How Spotfire Builds Tables

  • After you insert columns, the Spotfire Source Information tab describes the columns ignored rather than the columns added. Why is that?
  • Did Spotfire add columns to your project? Are you seeing columns in your project that you don’t recognize?
  • Do you have duplicate columns in a table, and you have no idea how they got there? Have you tried to delete them, and they keep coming back?

That’s a bold title, huh? I feel like I’m playing a game of rock, paper, scissors, and I just threw fire. You can only throw fire once in your life (allegedly). I’ll never be able to use that title again. The title is warranted because I have never encountered a Spotfire user who knows why the things above happen. For the longest time, I asked these questions and came up with nothing for an answer. The answer lies in how Spotfire inserts columns under the hood.
 
Before I get to the answer, I must warn that your Spotfire might look different. The features and functionality I am going to discuss have changed over versions. The change took place somewhere between 7.9 and 7.12 and was not listed in the What’s New in Spotfire Community page. My screenshots come from 7.12.
 

Source Information

Let’s begin in the Edit > Data Table Properties > Source Information tab. The screenshot below shows Source Information for a table with an insert columns operation. In 7.12, Spotfire show you both the columns added and the columns ignored.
 
Previously, this dialog only listed the columns ignored. If you are in 7.9, you will NOT see the columns added. This was maddening! Understanding columns added required the user to sort through Column Properties. The change in table name indicates inserted columns.
 
This may seem like a developmental oversight corrected in later versions. It doesn’t matter, right? WRONG. While they have added a section to show you the columns added, the build under the hood has not changed.
 

How Spotfire Works

When the user inserts columns from one table to another, the application builds a list of columns to ignore and works with that. It does not add. It ignores. That’s why the dialog is what it is (even though users didn’t care about that). Now, to make sure my explanation is easy to follow, let’s look at an example. 
 
The screenshot below shows my production table and header table. Production has 6 columns. Header has 11 columns.  
I want to add header data to my production table. To do so, I will insert Spud Date and FDOP into the production table. When I do that, Spotfire builds a list of columns to ignore (as shown in the screenshot above).  Here is the resulting table.
 
Now, this doesn’t impact the user….UNLESS…unless the data source changes. So, let’s say I update my original HEADER spreadsheet to include 2 additional columns — Well Type and Artifical Lift.  When I do that, it automatically adds those columns to the Production table.  Because these columns are new, they are NOT part of the ignore list. When the join operation happens, they aren’t ignored. They come into the table as you can see here. That’s how the problem happens!! Now, my project is messy. (I hate messy projects).
If that weren’t bad enough, let me show you how this problem compounds itself.  Let’s pretend that after inserting Spud Date and FDOP, I realize I also need Field and Operator.   
To get them in there, I add another join (because I’m not on 7.13 yet, and I can’t edit insert columns).  Not a big deal, right? WRONG.  

Now, Spotfire has two ignore lists and two joins. Because there are two joins, Spotfire adds my new columns — Arfitifical Lift and Well Type — twice. This is where those duplicates come from.  So, not only do I now have two columns in my table that I don’t need, but I have duplicates of them! Now, you understand how this happens.  This can also happen if you insert columns into a table that you’ve used in a join, not just when an information link is updated.

I know you’ve tried to delete these columns, and they are still a problem. How do we fix them?

How to Fix It

The fix is actually pretty simple. Add an Exclude Columns transformation to the Header table. How does this fix the problem? Well, you can modify an Exclude Columns transformation to exclude the new columns you don’t want. BAM!  Problem solved.
 
Now you know a little bit more about how Spotfire works under the hood. Lastly, this problem and solution is common across all data source types.  It doens’t matter if you are working with a spreadsheet, an information link or an Access table.  

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

Leave a Comment

Your email address will not be published. Required fields are marked *