Part 1 – Decomposing Tables & Data Sources

Last week, I teed up a series on decomposing Spotfire projects. This week, we are diving in with Part 1: decomposing tables and data sources. I cover tables and data sources first because they are the foundation of the project. You must understand where the data comes from before you can understand the rest of the project.

Now, each post in the series will be broken down into four sections.

  1. Quick and Dirty (Q&D)
  2. The Extended Version
  3. Documentation
  4. Room for Improvement

The Q&D explains what to look for to get a general idea of what is going on. The Extended Version works to develop an in-depth knowledge of the project. Each post will also include suggestions for documenting and improving the project.

Quick Word on Documentation

Before we dive in, I want to say a few words about documentation. Decomposing Spotfire projects will most likely be messy and complex. A project may start off simple, leading you to believe you can keep track of how everything is built and structured in your head. I'll tell you right now, you can't. Any project that has more than three tables will get complicated quickly. It's also likely you'll be interrupted or have to work intermittently, and it's probable that you'll forget what you learned. I highly recommend documenting as you go. There is no perfect tool. I use a combination of PowerPoint and Excel. Ultimately, I would love to see some documentation functionality available in the application in the future. For now, just pick a tool and stick with it through a single project.

Now, let’s dig into tables.

Quick and Dirty

  1. How many tables does the project contain?
  2. Are any tables embedded?
  3. Where do the tables originate from?

How: Go to  Edit > Data Table Properties > Source Information or open the Data Panel.  See specific details in the captions.  Click on one table at a time.

Click on one table at a time and view the first piece of information in the Source Information tab.

 

Select the table from the drop-down in the top right-hand corner of the panel. Then, click on the Cog. Finally, click Source View in the top right-hand corner of the expanded menu. Source Information is in the information tab.

 

Information on whether a table is linked or embedded is in the General tab.
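If you would rather script the first Q&D question than click through each table, here is a minimal sketch. It assumes it is run as an IronPython script inside Spotfire, where the script context supplies a `Document` object, and it reads only `Name`, `RowCount`, and `Columns.Count` from each data table. The function names `summarize_tables` and `print_summary` are my own, not part of Spotfire. It does not tell you where a table originates or whether it is embedded; for that, you still need the dialogs above.

```python
from __future__ import print_function  # keeps print() working under IronPython 2.7


def summarize_tables(tables):
    """Return one (name, row_count, column_count) tuple per data table.

    `tables` is any iterable of objects exposing Name, RowCount and
    Columns.Count, which is what Document.Data.Tables yields in Spotfire.
    """
    summary = []
    for table in tables:
        summary.append((table.Name, table.RowCount, table.Columns.Count))
    return summary


def print_summary(summary):
    """Print the table count, then one line per table, sorted by name."""
    print("Tables in project: %d" % len(summary))
    for name, rows, cols in sorted(summary):
        print("%s: %d rows, %d columns" % (name, rows, cols))


# Inside a Spotfire script (assumption: `Document` is provided by Spotfire):
# print_summary(summarize_tables(Document.Data.Tables))
```

The printed lines can be pasted straight into whatever documentation tool you picked.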

 

Extended Version

  1. Are any of the data sources on demand?
  2. Are any of the data sources limited by prompts before entering the DXP?
  3. How large are the tables and how long do they take to load?
  4. Are there any data connections in the DXP?

How: Go to  Edit > Data Table Properties > Source Information or open the Data Panel.  See specific details in the captions.  Click on one table at a time.

If a table is either configured with on-demand settings or created with a data function, the Settings button will be available (i.e., not grayed out). Click it to see the on-demand settings or the data function parameters.

 

On demand settings are clearly visible from the Data Panel when you click on the first data source.

 

Unfortunately, the only way to know if an information link has prompts and what has been selected is to click Refresh Data With Prompt on all information links. It’s painful, but you’ll only do it once.

How: Go to Help > Support Diagnostics and Logging > Diagnostic Information

This menu shows the size and load time for every table in the DXP. This includes counts of columns and rows. Click the Copy Info button and paste into Excel for easier viewing.

How: Go to Edit > Data Connection Properties 

Go to Edit > Data Connection Properties to see if any data connections exist.

Lastly, another obvious question to ask and answer is: are all of the data tables being used in the project? A future post will address this question.

Documentation

As previously noted, documentation can be as simple or as complex as you want it to be.  This is an example of a simple PPT table containing information on the table source and some general properties.  I would also add a column to denote On Demand tables.

This is a PPT slide I put together to summarize the usage of different tables in a DXP.  This slide was part of a larger PPT deck to document a project.
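If you prefer Excel to PowerPoint, the same kind of table inventory can be kept as a CSV file. The sketch below is one possible layout, not a standard: the column names in `FIELDS`, the function `write_inventory`, and the two example rows are all hypothetical placeholders, and the `on_demand` column is the extra column I suggested above.

```python
import csv
import io

# Hypothetical column layout for a table inventory; adjust to taste.
FIELDS = ["table", "source", "storage", "on_demand", "notes"]


def write_inventory(rows, handle):
    """Write one CSV row per documented data table to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Placeholder rows for illustration only; replace with what you find in the DXP.
inventory = [
    {"table": "Example Table A", "source": "information link",
     "storage": "linked", "on_demand": "yes", "notes": ""},
    {"table": "Example Table B", "source": "file",
     "storage": "embedded", "on_demand": "no", "notes": "consider renaming"},
]

# Write to an in-memory buffer here; in practice, pass a file opened with
# open("table_inventory.csv", "w", newline="") instead.
buffer = io.StringIO()
write_inventory(inventory, buffer)
print(buffer.getvalue())
```

One row per table keeps the file easy to sort and filter once it is open in Excel.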

Room for Improvement

Finally, before wrapping up, I want to suggest a few ways to improve the project tables. Here are a few questions for you to think about.

  1. Would renaming tables make them easier to understand?  Take a look at this post on naming data tables. 
  2. Would renaming data connections make them easier to understand?  Most people don’t name their data connections, and if there is more than one, it gets confusing quickly.
  3. If a table is embedded, would it load faster if transitioned to a data set? Connecting to a data set is frequently faster than connecting to a file.

That’s it for this post.  I hope you found it useful. Next week’s article will look at Data Functions.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!
