Part 7 — Decomposing Visualizations and Data Limiting

This is the final blog post in my 7-part series on decomposing Spotfire projects. This series is one of the longest I’ve ever written. Two parts in, I considered writing an ebook instead of a blog series! This week’s post focuses on decomposing visualizations and data limiting. If you are new to the series, here are links to the other posts.

As usual, each post in the series will break down into four sections.
  1. Quick and Dirty (Q&D)
  2. Extended Version
  3. Documentation
  4. Room for Improvement

First, the Q&D explains what to look for to get a general idea of what is going on. Then, the Extended Version presents a more complex picture. Documentation provides examples of how to document if necessary. Lastly, Room for Improvement looks at making the project better. Before diving in, I would like to provide context on two subjects: the potential complexity of visualizations and the breadth of data limiting. Let’s start with the potential complexity of visualizations.

Understanding the Potential Complexity of Visualizations

On one hand, this complexity is awesome. It allows for a ton of customization in visualizations. But it’s a double-edged sword when inheriting a project. What do I mean exactly? Custom expressions can be added to any column selector, and column selectors are everywhere. They exist in every visualization properties menu. Thus, without going through every menu, right-clicking, and selecting Custom Expression, it’s almost impossible to know where they are used. The same is true for property controls: you can also ‘Set from Property’ against any column selector.

This is why I take screenshots or save copies of DXPs that I am modifying. From a decomposition standpoint, this is problematic. There’s not a lot you can do. Just know what’s possible. Next, I want to make sure the reader is aware of all the ways in which a visualization can be limited. So, let’s talk about data limiting.

Data Limiting

Data limiting is actually quite a large topic.  I’ve wanted to write comprehensively about it for a while and have the first post in a series drafted.  Thus, this post will stay high level.  That future series will go into greater detail.  The questions we are asking and answering right now are — In what ways can a visualization be limited?  Where are all the possible places you might find data limiting?  Here’s the summary.
  1. Filtering with the filter or data panel
  2. Filtering schemes (Visualization Properties — Data menu — Limit data using filtering section)
  3. Details visualizations or limiting with marking (Visualization Properties — Data menu — Limiting with Marking section)
  4. Limiting with expression  (Visualization Properties — Data menu — Limiting with Expression)
  5. Show/hide rules  (Visualization Properties — Show/Hide Items menu)
  6. Subsets (Visualization Properties — Data menu — Subsets menu)
  7. Relations (Edit menu — Data Table Properties — Relations tab)

You might be wondering why I threw relations in there.  Relations integrate filtering across tables.  I’ve had enough users have problems with it that I thought it worth mentioning separately.
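
If it helps, the seven locations above can be turned into a reusable checklist to walk through for each visualization. The snippet below is only a sketch in plain Python. It does not talk to Spotfire, and the function name and the sample visualization name are made up for illustration.

```python
# The seven places data limiting can live, taken from the list above.
# Each entry is (technique, where to look in the UI).
DATA_LIMITING_LOCATIONS = [
    ("Filtering", "Filter panel or Data panel"),
    ("Filtering schemes", "Visualization Properties > Data > Limit data using filtering"),
    ("Limiting with marking", "Visualization Properties > Data > Limiting with Marking"),
    ("Limiting with expression", "Visualization Properties > Data > Limiting with Expression"),
    ("Show/hide rules", "Visualization Properties > Show/Hide Items"),
    ("Subsets", "Visualization Properties > Data > Subsets"),
    ("Relations", "Edit > Data Table Properties > Relations"),
]


def build_checklist(visualization_name):
    """Return a header plus one unchecked line per location (hypothetical helper)."""
    lines = [f"Data limiting review: {visualization_name}"]
    for number, (technique, where) in enumerate(DATA_LIMITING_LOCATIONS, start=1):
        lines.append(f"[ ] {number}. {technique} ({where})")
    return lines


if __name__ == "__main__":
    # "Production by Well" is a made-up visualization title.
    for line in build_checklist("Production by Well"):
        print(line)
```

Pasting the printed lines into your project notes gives you one box to tick per location, so nothing gets skipped when a project has many visualizations.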

Limiting with 2 – 7 shown here.

Okay, now let’s get into the quick and dirty.

 Quick & Dirty (Q&D)

Here is the first set of questions you want to ask and answer about visualizations.
  1. Do all of the visualizations work? Do you see any obvious errors?
  2. Is data limited and if so, where or how?
  3. Did the developer include add-ons like lines and curves?

Do all of the visualizations work? Do you see any obvious errors?

Here is an example of a visualization with an error. A column used in the visualization can’t be found. When you see these errors, the first and most obvious places to look are the x- and y-axes. But don’t forget that columns can be used in custom expressions anywhere in the Visualization Properties menus, where they can be a bit difficult to find. If the problem isn’t on an axis, start with the Data menu and work your way down the menu list.

Is data limited and if so, where or how?

The data limiting section above explains where to look for data limiting.  Without checking every single menu location, you can also get good information from the legend as shown in this example.

You may need to turn on data limiting and show/hide by right-clicking in the white space of the legend.

Did the developer include add-ons like lines and curves?

 Lines and curves may not be super high on the priority list, but it is a good idea to know if they are used.  If the developer included a label, they will be easy to identify as shown in this example below.  They are also identifiable by different formats.  You can only customize the format of lines added via the Lines and Curves menu.  In a generic line chart, all lines will be solid.  

Extended Version

If you want to dig deeper, ask and answer these questions.

  1. Is the visualization controlled by property controls?
  2. Did the developer write expressions on an axis of the visualization?
  3. Did the developer build custom expressions into visualizations?

Is the visualization controlled by property controls?

As mentioned above, property controls can be attached to any column selector.  This makes it difficult to find everything they control.  However, I want to show you a little indicator you might not have noticed.  In the screenshot below, the exact same chart is duplicated.  The top chart’s Line by variable is controlled with a property control.  The bottom chart’s Line by variable is not.  The subtle difference is the presence of the down arrow and plus sign.  When property controls are attached, these are no longer options.  Keep an eye out for this.

Did the developer write expressions on an axis of visualizations? Did the developer build custom expressions into visualizations?

Both of these questions might be hard to answer without right-clicking and looking for a custom expression. However, there are clues. In the example below, the developer has written an expression on the y-axis. I know this because of the [Axis.] syntax used. Unfortunately, if the developer used the “As” keyword to rename the expression, you will only see the given name.

To learn more about writing expressions on the axis, check out this link.  The post is a bit old, so I apologize if it’s hard to read.  I was new to blogging when I wrote it.

Documentation

 I am guilty of not documenting my visualizations.  As I write, I realize that I should.  The application doesn’t natively allow tracing of data limiting, custom expressions, or property control usage.  Thus, it’s up to the developer to leave some breadcrumbs.  It might be a good idea to keep lists of ….
  1. Property control connections
  2. Custom expression locations
  3. Data limiting in visualizations
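
One possible way to keep those three lists together is a small record per visualization. This is a sketch only, written in plain Python outside of Spotfire. The class, its field names, and the sample values (page, title, property name, expression) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VisualizationNotes:
    """Breadcrumbs for one visualization; every name here is hypothetical."""
    page: str
    title: str
    # column selector -> document property that drives it
    property_controls: dict = field(default_factory=dict)
    # menu location -> custom expression text
    custom_expressions: dict = field(default_factory=dict)
    # free-text notes on where data is limited
    data_limiting: list = field(default_factory=list)

    def summary(self):
        """Render the notes as indented text for a project document."""
        lines = [f"{self.page} / {self.title}"]
        for selector, prop in sorted(self.property_controls.items()):
            lines.append(f"  property control: {selector} <- {prop}")
        for location, expression in sorted(self.custom_expressions.items()):
            lines.append(f"  custom expression: {location}: {expression}")
        for note in self.data_limiting:
            lines.append(f"  data limiting: {note}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample values, not taken from a real project.
    notes = VisualizationNotes(
        page="Overview",
        title="Monthly Volume",
        property_controls={"Line by": "LineByColumn"},
        custom_expressions={"Y-axis": "Sum([Volume]) as [Total Volume]"},
        data_limiting=["Limited by marking from the map chart"],
    )
    print(notes.summary())
```

A plain spreadsheet would do the same job. The point is simply to record, per visualization, which selectors are property-controlled, where custom expressions live, and how data is limited.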

You can also create rules for the project, such as…

  1. Always show data limiting in the legend when applicable.
  2. Indicate that a property control defines a column selector with an asterisk as shown below.
  3. Use a similar convention for custom expressions in the naming.

 

Next, let’s talk about making the project better.

Room for Improvement

  1. Does the visualization tell the story? What are the questions you are trying to ask and answer? Does the project flow well?
  2. Would zoom sliders make the data easier to consume?
  3. Are the fonts and sizing easy to read?
  4. Does the user need to see everything on the page?

Does the visualization tell the story? What are the questions you are trying to ask and answer? Does it flow well?

Spotfire projects always seem to start out nice and clean and then slowly morph into a bit of a mess.  It’s easy to clutter up visualizations by adding….

  • Big text areas that aren’t using HTML and CSS.
  • Too many visualizations on the page.
  • Too many visualizations attempting to answer the same question.
  • Visualizations that don’t answer the question clearly.

One of the best ways to improve a project is to clearly define the business questions the visualizations are supposed to answer. Then, build around the order of those questions and other potential questions that might arise while working through them.

Would zoom sliders make the data easier to consume?

When I first started using Spotfire, zoom sliders were one of my favorite features. In Excel, I had to duplicate bar charts to make up for a jam-packed x-axis.  Add them to any x or y-axis by right-clicking on the visualization and choosing from among the options in Visualization Features.  To take the analysis to the next level, incorporate this IronPython script into a button to easily reset zoom sliders.

Are the fonts and sizing easy to read?

The default font sizing may not work with your monitors and the content shown. There are three ways to modify it.

  1. Modify every visualization (not the best option) in Visualization Properties — Fonts menu.
  2. Modify the theme, which controls fonts and sizing across the analysis.
  3. Use HTML and CSS to customize the fonts.

Click the black and white button in the toolbar to access themes. Then, review all of the theme menus, because font and sizing options appear in several different places.
Example of fonts in visualization properties.

Does the user need to see everything on the page?

Visualizations can get overcrowded, but you can hide many of the “standard” items on a plot. Simply right-click on the visualization, select Visualization Features, and then deselect as desired. It would be nice if there were a way to apply this across all visualizations, but this feature isn’t available yet. More specifically, if you are worried about users changing visualizations, you can hide the axis selectors. New users won’t know how to turn them back on.

 

Conclusion

Whew!  That was a seriously long online discussion.  I hope you found the content useful.  Next up, I have a series coming out on data limiting and another intermittent series that I’m calling “Simple R”.  Until then, Happy Holidays!

Spotfire Version

Content created with Spotfire 7.12.

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy’s Advanced Analytics and Emerging Technology team!

4 thoughts on “Part 7 — Decomposing Visualizations and Data Limiting

  2. brian doe

    Hi Julie, you do such a great job on these posts! Super useful, thanks!
    One suggestion I have is to mention in the Data Limiting section (or a future post) the technique of data limiting through inner joins (where the inner join is dynamic based on a property control) or, similarly, data limiting through an R script that simply duplicates a table, based on the input table being filtered: y<-x, where x and y are data tables and the x table is limited by filtering (or marking). I use both of those techniques a lot to create a second table that is dynamically limited based on filtering/marking in the original table.
    I feel this makes OVER calculations much easier via calculated columns rather than in-viz custom expressions.

      • brian doe

        Great!
        I thought I saw that TERR method somewhere on here… I just think it’s SO handy as a “Data Limiting Technique” that it’s worth listing!

        The dynamic inner join I’m referring to is one where the inner join “key” can be modified based on a document property. One example is to create a boolean column that is dynamic based on a document property ([Fruit]=DocumentProperty(“Fruit”), so that it will be true only for the selected fruit).

        Then I inner join (add columns) to that table from a second “helper” table (1 row x 2 columns {‘True’,‘Joined’}) which only has “True” in the inner join key column and just shows “Joined” in the added column.

        We use that method rather than TERR if we suspect it will be faster, or don’t want to depend on the stats server.
        It’s very similar to the row level security method (the document property in that case is the username to inner join to the table).
        It basically “deletes” the unwanted rows via the inner join. Thankfully (magically?), the “deleted” rows come back when you change the doc property from “apples” to “oranges”.
