- Have you ever experienced difficulty deciphering the granularity of an unknown data set?
- Have you had trouble figuring out what defines a record or makes a record unique?
When working with an unfamiliar data set, one of the first things you should seek to understand is what makes a record unique. What defines the granularity of the table? This can be really hard to do. You might try looking at unique values in the filter panel, but once you realize uniqueness requires more than one column to define, the filter panel becomes a dull tool for the job. There is, however, a quick hack that you can use in Edit, Data Table Properties. There is a section in the General tab called ‘Key columns for linked data’. You can use the following menus to help define the granularity of a table.
Clicking on the edit button brings you to a screen that looks like the screen shot below. There will be 2 – 3 drop down menu options. Sometimes the option for ‘Columns with unique values for all rows (Recommended)’ option is not present. I have a few caveats about the option ‘Columns with unique values for all rows (Recommended)’, but I’ll come back to that shortly.
What this menu will do is allow you to move columns from left to right, defining the columns that make a record unique. For the purpose of example, I am simply going to choose ‘All Columns’ from the drop down to see all columns in the data table. In this example, I am looking at a production data set. I know that I have one record per well per day. If I move ProductionDate and WellName to the right and click ok, Spotfire agrees and I have essentially confirmed what makes a record unique.
If that were not the case, I would see this error message, as I do when I replace WellName with Reservoir. ProductionDate and Reservoir do not create unique records. Therefore, you can use this function in Spotfire to help identify what makes a record unique in a data table.
Before I wrap up, I mentioned a caveat about ‘Columns with unique values for all rows (Recommended)’. As the Help section indicates, any column shown when you select ‘Columns with unique values for all rows (Recommended)’ should have unique values in each record. However, I have experienced that this isn’t always the case. I have copied a screenshot of a data set that has a column called Machine. There is only one value in the data set for Machine, and that does not make the record unique. TIBCO support has informed me that this is a defect and will be remedied in version 7.8 that should be released in January.
Guest Spotfire blogger residing in Whitefish, MT. Working for SM Energy’s Advanced Analytics and Emerging Technology team!