Month: November 2018

Part 5 – Decomposing Calculations

This week’s post (part 5 of 7) focuses on decomposing columns and calculations.  If you are new to the series, here are links to the other posts.

  1. Intro to the Series
  2. Decomposing Data Tables & Connections
  3. Breaking Down Data Functions
  4. Untangling Data Wrangling
  5. Decomposing Document Properties

As usual, each post in the series will be broken down into four sections.

  1. Quick and Dirty (Q&D)
  2. Extended Version
  3. Documentation
  4. Room for Improvement

First, the Q&D explains what to look for to get a general idea of what is going on.  This week, I have combined the Extended Version and Room for Improvement.  Lastly, Documentation provides a few examples of how to document if necessary. 

Quick & Dirty

The quick and dirty on this post is going to be very quick.  The most important thing to understand about calculations is — Do they work?  After that, it’s all additional information and improvement.  There are two easy ways to find out if all calculations are intact.

  1. Warnings upon file opening
  2. Edit > Column Properties > Is Valid

Spotfire will provide warnings in the bottom left-hand side of the screen when the DXP opens indicating issues or problems.  If you see a red exclamation mark when opening the DXP, click on the link, copy the contents to the clipboard and save to review.

This example shows a warning. A red exclamation mark will appear if calculations are broken.

My preferred way to check for broken calculations is to use the Edit > Column Properties menu.  Open this dialog, find the IsValid column, and click on it to sort.  A value of True means the calculation is working.  A value of False means the calculation is broken.  When I say broken, I mean one of the columns or document properties referenced in the calculation is no longer recognized.  Investigate each False column by clicking the Edit button.  Any element not recognized will be red.

Note the value of false indicating a broken column.
This is broken because I deleted the GAS column.

Now, keep reading if you want to go deeper.

Extended Version/Room for Improvement

Once you have confirmed all calculations are working, everything else is really information and/or improvement.  Here are some questions you can ask and answer.

  1. Did the author use a column naming convention? Should you create one?
  2. Did the developer add descriptions to the columns? Should you add some?
  3. Are all of the columns and calculations used? Can you delete/exclude unused calculations or columns?
  4. Did the developer insert calculated columns into other tables? Is it possible to rearchitect?
  5. Is it possible to combine calculations to create a smaller table?

Did the author use a column naming convention? Should you create one?

Naming conventions are great…if you can come up with one and stick to it.  A few weeks ago, I had serious heartburn about a large project I built without a column naming convention.  Most developers don’t because Spotfire doesn’t make it easy.  Naming conventions take time but are well worth the effort.  To find them, simply scroll thru the columns to see if anything jumps out at you.  Developers may employ naming conventions in the following places…

  1. Notations for calculations, like “c.” (a personal favorite as indicated by my screenshots)
  2. Notations for calculation created by transformations “t.”
  3. Units of measure
  4. Data sources
  5. Conventions for tables that might have similar or duplicate columns, such as “c.Master”

Did the developer add descriptions to the columns? Should you add some?

Now, Spotfire does not allow you to embed comments in calculations, as with scripts or data functions.  Why would you need that?  Well, I recently wrote a calculation that was 67 lines of case logic.  It would have been nice to add comments explaining what each line of logic was doing.  However, I do frequently use the column description to explain the details of a calculation or column.  Simply walk down the list of columns looking at each one to see if there is a description.

Are all of the columns and calculations used? Can you delete/exclude unused calculations or columns?

It is also helpful to know if all of the columns are used.  You could delete or exclude (via transformation) unused columns to make the project more user-friendly.  Unfortunately, there is no easy or quick way to figure this out.  Ultimately, you want to know if columns connect to visualizations, custom expressions, calculations, scripts, data functions, or text areas.

Additionally, attempting to delete a calculation will indicate if it is part of another column.  As shown below, Spotfire will warn against deleting it.  It would be a nice improvement on the software if you could see where columns are used in calculations.

Did the developer insert calculated columns into other tables? Is it possible to rearchitect?

I see users do this all the time, and it’s one of my pet peeves.  Avoid this if you can. Why?  First, if the granularity of the tables differs, the data may be wrong.  Second, if a user wants to understand the calculation, they must trace back to the table where the calculation originated.  Third, most likely, the calculation isn’t needed in both tables, which means it’s inefficient. If the developer employed a naming convention, it is easy to find inserted calculated columns.

Here, I have used the nomenclature “c.” to denote the calculated column in the table where they originate.

 

Note: Calculated columns inserted into a table don’t appear as Column Type “Calculated” anymore. Now they are “Imported”, which is why the naming convention is helpful.

 Is it possible to combine calculations to create a smaller table?

When developing a project, it is extremely common to break complex calculations into pieces to see if they work or to troubleshoot.  A good developer will go back and combine them for maximum efficiency.  However, it’s easy to fall into the trap of — It works!  Let’s move on.  There’s no easy way to figure this out other than to just spend time in the project and get to know the calculations.  I recommend undertaking this task after developing a solid understanding of the project.

Documentation

Documentation like this is valuable for two reasons.  First, it allows you to take a comprehensive look at the project.  You get a high-level look, rather than the detailed look of examining one by one. Second, it prevents having to relearn the project.  Check the reference rather than digging thru the project repeatedly.

This matrix explains usage by asking and answering the question — Where is this column used?

 

This matrix shows how calculations connections and relationships.

 

Conclusion

Well, that wraps up this post.  There are only two posts left in the series — Text Areas & Scripting and Visualizations & Data Limiting. Originally, this was a 7 part series, but I will add on at least one more part if not two.  As I’ve written the series, I have also slowly generated a list of things good developers do.  Thus, it makes sense to summarize best practices.  I’ve also inadvertently generated a list of potential improvements to the application.  I’ll write those up, add them to the Ideas portal, and create a summary linking to the Ideas portal that will allow users to upvote those ideas.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

Part 4 — Decomposing Document Properties

This series has covered decomposing tables and connections, data functions, and data wrangling.  This week, it moves to decomposing document properties.  I have intentionally broken document properties out separately because there are so many different places they can be used.   If you are just tuning in, here are links to the previous posts in the series.

 Intro to the Series

Decomposing Data Tables & Connections

Decomposing Data Functions

Decomposing Data Wrangling

As usual, each post in the series will be broken down into four sections.

  1. Quick and Dirty (Q&D)
  2. The Extended Version
  3. Documentation
  4. Room for Improvement
In previous posts, both the Q&D and Extended Version sections presented 3 – 5 different questions. Q&D questions were simple in nature, while the Extended Version dug deeper.  In this post, the questions are the same for both sections.  The difference between the two is where you will go to answer them.  The Extended Version will involve more digging.  The Documentation section will provide examples of how to document just like the last post.  Lastly, Room for Improvement looks at a project from the standpoint of making it better. So, let’s get started.

Understanding Document Properties

Now, the most important questions to ask and answer about doc properties are these:

  1. Which types of doc properties are being used?
  2. Are all of the doc properties used?
  3. Are any parts of the project broken because document properties were deleted?

Before we get started on answering these questions, I want to lay out a few critical pieces of information on doc properties.

  1. Types of doc properties in Spotfire
  2. Locations where doc properties can appear
  3. Syntax used to reference doc properties

Types of Document Properties

There are 7 types of doc properties available in Spotfire.  I have included an example of each.

  1. Label
  2. Input field
  3. Input field multiple lines
  4. Drop down
  5. List box
  6. List box multi-select
  7. Slider

Locations Where Document Properties Appear

Doc properties can be engaged in the following areas within a DXP.

  1. Text Areas
  2. Data Functions
  3. Calculations
  4. IronPython and JavaScripts
  5. Visualization Properties

Now, a good place to develop an understanding of doc properties is the Edit menu > Document Properties > Properties tab.

This dialog will show all doc properties in the DXP, including default properties created by the application (MaxMissingTimeParts, FiscalYearOffset).  This list will tell you exactly how many doc properties exist in the DXP.  Use it as a checklist while walking thru the project.  Unfortunately, this dialog fails to detail the type of doc property or where it is used.  This is also where you will delete unused document properties if necessary.

Doc Property Syntax

Expressions reference doc properties in 2 ways.  You will see different syntax depending on how the document property is being used or the type of value in the doc.  In the screenshot below, I inserted the same doc property as a value and again as text to demonstrate the difference.

Quick and Dirty

The Q&D section will focus on text areas, data functions, and scripts because it’s very easy to click thru a project and see document properties engaged in these areas.

Right-click on a text area and select Edit HTML. All document properties in a text area (even hidden ones) appear on the right-hand side of the dialog with the type in parenthesis.

JavaScripts also appear in the menu to the right of the text area.  Select each one and click Edit to look for doc property references.  In order to find doc properties used in IronPython scripts, go to Edit > Document Properties > Scripts tab and look at each script individually to find where doc properties are used.  I don’t have a project handy with an example, so no screenshots for this one (SORRY!). I do have an example of a doc property used in a data function.

Go to Edit > Data Function Properties > Edit Parameters and look for document properties in each input or output parameter. This doc property works with an expression in a data function.

Extended Version

Now, for the Extended Version.  It is much more time consuming (although not difficult) to find doc properties used in calculations and visualization properties.  For calculations, go to the Edit > Column Properties > General tab.  Scroll toward the bottom of the list (calculations are appended below imported columns) and keep an eye out for the document property syntax.  There really isn’t a better way to do this.  In theory, you could delete one document property at a time and look for ‘False’ in the IsValid Column Property, but that seems like overkill.  Perform this task one table, one column at a time.

Next, doc properties can run deep in visualization properties.  Anywhere you can right-click and either Set from Property or write a Custom Expression, doc properties can be used.

In this example, a drop-down property control sets the categorical axis. You must right click and look at the expression to know.

This includes most submenus in the properties dialog.  Pay particular attention to data limiting (in the Data menu), Colors, Lines & Curves, and Show/Hide Items.

I will admit that finding all possible references to document properties can be extremely time-consuming.  If in doubt, leave the doc property in and/or save versions of the project as you go.

Documentation

The screenshots below came from a PPT documentation I assembled for a large project with a LOT of document properties.  The text areas, data functions, IronPython scripts, visualization properties, and calculations all contained doc properties.  Documenting usage became necessary to gain a comprehensive understanding.

First, list out all the document properties.
Create a matrix showing where utilized.
Map interaction with data functions. This maps the object name in the code to the doc property name.
List where used in calculated columns.
Note which page includes doc property references and their type.

Room for Improvement

Lastly, here are a few questions to ask and answer that will improve the project.

  1. Would renaming document properties make their purpose easier to understand?
  2. Can you delete unused doc properties?
  3. Would adding document properties improve the flexibility of the project?

Unfortunately, Spotfire does not allow you to rename document properties.  To rename them, you must essentially create a new document property.  I only “rename” document properties when completely rebuilding a project. I recommend removing unused document properties.  If you delete, do a quick walk-thru of the project after each removal to ensure nothing has broken. Lastly, you might be able to improve the flexibility of a project by adding document properties.  For example, if visualizations have show/hide rules or color rules that are currently hardcoded, you could replace the hard coding with a document property reference to make them more dynamic.

Conclusion

Wow, that was a long post.  Now, you can see why I chose to break doc properties out on their own rather than including with another post. I hope you now have a much more comprehensive understanding of where document properties can be used in a DXP.  They are one of the standout features of the application.  If anyone from TIBCO is reading this, it would be great to have functionality within the application to track down usage of document properties.

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

How to Remove Title from DCA Wrangler

Right now there isn’t a way to take the title off the DCA Wrangler visualization. It takes up some space so it would be nice to tighten it up a bit. Here’s a quick IronPython script that will easily unflag the ShowTitle attribute:

Remember, once you write the script, you’ll still need to press OK to make it permanent and go through the whole process of creating a button. Only then can you delete the button and keep it the way you want. This procedure works for any custom visualization.

Finished product!

Technical Director at Ruths.ai