Author: Julie Sebby

Guest Spotfire blogger residing in Whitefish, MT.  Working for SM Energy's Advanced Analytics and Emerging Technology team!

IronPython to Sort the Bars of a Combination Chart

  • Are you running an older version of Spotfire (7.9 or lower)?
  • Would you like to be able to sort the bars of a combination chart in the same way that you can sort the bars in a bar chart?
  • Would you like a little bit more control over the appearance of combination charts?
  • Are you working on IronPython skills?


TERR Errors with tidyr Package

  • Have you had problems with TERR since the rollout of TERR 4.4 in Spotfire 7.11?
  • Are you running into compatibility problems between package versions?
  • Would you like to be able to remove the default package installation and install a specific version of a package?
  • Are you getting the error — Error in .BuiltIn(“on.exit”) : the S language function ‘on.exit’ is not stored as a variable value, and can only be accessed by evaluating source expressions containing ‘on.exit’ statements — when trying to use the tidyr R package?


Spotfire Naming Conventions — Part 1

A few weeks ago, I wrote this post on my biggest regret when building a large project. That regret was not developing and consistently applying a column naming convention. A week or so after publishing, a user asked via comments for best practices on naming conventions. I’ve had something in draft since then, but I’ve struggled to git ‘er done because it’s such a meaty subject. I cleared that hump this week when I saw a data table naming convention that I could elaborate on. Then the light bulb turned on. It’s not workable to develop a single naming convention for Spotfire. Rather, develop a different convention for each type of Spotfire element. Now, I can break the subject into manageable bite-sized pieces.

So, I’m going to write a series on naming conventions for the Spotfire “elements” shown below. These are the bits and pieces where I’ve run into difficulty with names.

  • Data Tables
  • Columns
  • Document Properties
  • Data Functions & Scripts
  • Data Connections

This post will cover naming conventions for data tables. In the following weeks, I’ll work through the other naming conventions. Some content will be formulaic: do this, not that. However, other content will provide considerations for developing your own naming convention.

Lastly, when titling this post, I was hesitant to call this content a best practice. In my experience, naming conventions meld best practice and personal preference. Best practices have been tried and tested by lots of folks.  They are what works for large groups of people.  Personal preference is sometimes not a preference but more the result of working conditions. You know it’s not the best, but it’s the best you can do.

Why a Naming Convention

Now, you might think a naming convention for data tables is overkill. Please reconsider if one of the following might apply.

  • You are building something someone else will have to maintain. That someone else will thank you later if they inherit a project that is easy to understand.
  • You sleep or drink between now and the next time you work on this project (trust me on this one).
  • You are working on a project with another person (it will get messy).

Furthermore, I don’t believe one naming convention can meet all naming needs.

  • Spotfire elements have different “communication requirements”. In other words, you use the name to communicate useful information about a table, column, etc.
    • For tables, it’s helpful to communicate the type of data source, the data source, and the sequence of tables.
    • With columns, you might want to communicate the type of column, where it came from, or how it was created (ex. transformation, data function).
    • For data functions and scripts, you might want to communicate what the script does or what it impacts.
  • Some Spotfire elements have limitations, like column properties, which cannot have spaces.
  • Some Spotfire element names aren’t editable, like data function names.
  • Sometimes column names (more specifically the special characters) impact code.
  • Space is also an important consideration for some elements like data table names (in the legend) and column names (word wrap in tables).

Thus, there are many things to consider when naming things. Now, let’s talk about naming data tables in particular.

Common Pitfalls When Naming Data Tables

Here are some of the problems I run into when naming data tables.

  • The name is too long to see in the legend.
  • The data table name equals the name of the table in the database, which is helpful, but …
    • Underscores are used.
    • The name is in all caps (WHY?!?!?!).
    • Naming doesn’t make sense to users.
  • The data table name doesn’t equal the name of the table in the database and is thus hard to track down.
  • The name isn’t distinguishable from other table names.
  • The name doesn’t communicate useful information, and users must go into the General tab for more information.
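A few of these pitfalls are mechanical enough to check with a script. Here is a minimal sketch in plain Python (not IronPython against the Spotfire API) that flags three of them. The 30-character legend limit and the example table names are my own assumptions for illustration; pick a limit that matches your own legends.

```python
# Hypothetical limit: no specific number of characters is given above,
# so adjust this to whatever fits in your own legends.
MAX_LEGEND_CHARS = 30


def name_pitfalls(name):
    """Return a list of naming pitfalls found in a data table name."""
    problems = []
    if len(name) > MAX_LEGEND_CHARS:
        problems.append("too long for the legend")
    if "_" in name:
        problems.append("contains underscores")
    letters = [c for c in name if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        problems.append("all caps")
    return problems


# Hypothetical table names, just to show the checks in action.
for candidate in ["WV_JOB_DAILY_OPS", "Daily Ops [WellView]"]:
    print(candidate, "->", name_pitfalls(candidate) or "looks fine")
```

The other pitfalls (names that don't make sense to users, or that are hard to trace back to the database) still need a human to judge.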

Do you know of other common problems? Leave a comment! So now that we know what we don’t like, let’s talk about what we do like and/or what to take into consideration.

Considerations for Data Table Naming Conventions

Here are the items I considered when kicking off a new project. I spent a painful amount of time renaming tables, but seeing the names in place goes a long way. Feel free to comment on what you like or don’t like.

Type of data source (information link, data connection) or data source itself (WV stands for the WellView database).

Data table creation (ex. table created by a data function) or versions of a table (ex. Pivoted or unpivoted).

Relations and connectivity of tables (I experimented with this one and rejected it, so there is no screenshot).

Order of table creation.  Spotfire doesn’t let you reorder tables, and very often there is an order to how tables are created.  This can be communicated in the name.

My Naming Convention

Do you remember what I said above about best practice versus personal preference? That really came into play in my final decisions. Here’s what I settled on.

 

  • I wanted the data source type to be clearly identified with square brackets. I experimented with adding the type at the beginning and end of the data table name. It’s better for the legend to put it at the end, but I liked it at the beginning.
  • I included the data source name at the end in square brackets.
  • If data originates from within the project, that is noted instead of the data source type.
  • Pivot and unpivot transformations receive a designation.
  • If there is no transformation, I used the term [From Current Analysis].
  • If a data function or script creates the table, that goes into the square brackets.
  • I despise uppercase and underscores. However, I have a strong preference for matching what’s in the database. I’ve seen far too many similarly named tables in databases. I don’t want to work through the information designer to find the true data source.
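To make the pattern concrete, here is a rough sketch in plain Python of one possible reading of the convention above: a bracketed prefix (the data source type, or how the table was created inside the analysis), then the table name, then the data source name in brackets. The function name, its parameters, and the example values are all hypothetical, and the exact placement of each piece is my interpretation of the list rather than a fixed rule.

```python
def build_table_name(table, source_type=None, source_name=None, origin=None):
    """Compose '[prefix] table [source name]' from the parts described above.

    source_type: e.g. "Information Link" for tables loaded from outside.
    origin: how a table was created inside the analysis, e.g. "Unpivot"
            or the name of a data function (hypothetical examples).
    """
    if source_type is not None:
        prefix = source_type
    else:
        # Table originates in the project; fall back to the default term
        # when no transformation or data function is involved.
        prefix = origin or "From Current Analysis"
    parts = ["[" + prefix + "]", table]
    if source_name:
        parts.append("[" + source_name + "]")
    return " ".join(parts)


print(build_table_name("Daily Ops", source_type="Information Link", source_name="WV"))
print(build_table_name("Production", origin="Unpivot"))
print(build_table_name("Production"))
```

Writing the rule down as a function, even if you never run it, is a quick way to find out whether the convention is actually unambiguous.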

At the end of the day, it comes down to individual priorities in conveying information. I could write about this all day, and I have no doubt I will update the post eventually with new learnings. While working on this, I found YouTube channels dedicated to the subject of naming things. Now that is a deep, deep rabbit hole.

Rejected Ideas

As previously indicated, there were some rejects.  Instead of underscores, I tried periods.  I may still go back to periods.  Underscores take up so much space.

I also tried putting the data source type at the end of the name, but I just didn’t like it.

Lastly, I experimented with nomenclature to indicate the presence of relations or integrated filtering. I thought it would be useful, but I hated it so much, I didn’t even take a screenshot.

 

Final Recommendations

My final recommendations are these…

  1. Just start. Create your own naming convention.
  2. Write it down and keep it handy.
  3. Review it at the start of every new project.
  4. Modify it if it doesn’t work for you.
  5. Show it to other people, and get their opinions.

Your Reward

If you made it to the end of this post, you are rewarded with this link to an awesome Dilbert cartoon on naming conventions.  I can’t legally put it on this blog post without paying for it, so the link will have to do.

 

Spotfire Version

Content created using Spotfire 7.12.


How to Remove Months of Zero Production from a Cumulative Calculation

I don’t advertise it much, but I do fill requests for blog content. Last week, a Spotfire user contacted me via LinkedIn and asked the following question about a cumulative calculation…

        Is it possible to create cumulative production plots that automatically exclude months that have zero production? Oftentimes when we make cum plots manually, we completely delete months that have zero production. Wells have shut-ins for various reasons, and if included in the dataset, these time periods show a flat line on the cum plot. This makes it difficult to visually compare wells against each other, because some have downtime, and some don’t. So we want to disregard zero months completely, and not count them on the x or y axis. (LinkedIn Contact)

Spotfire can definitely handle this situation.  A single calculated column using the RANK function and an IF statement will take care of it.  Just so we are all on the same page, let me break down the requirements of this calculation.

Requirements

There are three distinct requirements in this request.

  1. Start production for each well at the same point in time (i.e. normalize the start date).
  2. Calculate the cumulative production for each well.
  3. Remove months of zero production.

Furthermore, here is an example of what the user wants to avoid…

Do you see those straight lines? Those are months of no production. The user does not want to see these straight lines. Straight lines make it hard to compare wells, so we are going to take those out.

Data Set

To develop a solution to this problem, I created a quick and dirty dataset using only 2 wells. I’ve included this so you have column names for reference. Now, let’s get to the solution.

Solution

First, to make sure both wells begin production at the same time, we must create a calculated column to normalize the production date. Of course, there is more than one way to write this type of calculation. My preference is to use the Rank function.

Rank([Production Date],[Well Name])

The expression shown above ranks the Production Date within each Well Name. I named the calculated column “c.Normalized Time”. This is a good calculation if all we wanted to do was normalize the Production Date. Because we also want to remove months where production is zero, we need to include an IF statement like this…

If([Gas Prod]>0,Rank([Production Date],[FDOP]),null)

This expression returns a null value when there is no Gas Prod. Now that we have all the necessary calculations, let’s move on to the visualization. We will use the Cumulative Sum aggregation to calculate the cumulative gas prod. Note, I have a calculated column in the data table, but I don’t actually need it.

If we stopped here, the visualization would not be quite right. It would look like this one. See the (Empty) at the end of the visualization? Those are the nulls created by the IF statement. We can get rid of those by simply filtering out the (Empty) values.
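If it helps to see the logic outside of Spotfire, here is a small sketch in plain Python that mimics the steps: rank the production date within each well, return a null (None) for zero-production months, drop the nulls, and then take a running sum. The sample rows are made up, I group by well name as in the first expression, and I assume one row per well per month. This only illustrates the logic; it is not how Spotfire evaluates the expression internally.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical sample rows: (well name, production month, gas produced).
rows = [
    ("Well A", "2018-01", 100.0),
    ("Well A", "2018-02", 0.0),
    ("Well A", "2018-03", 80.0),
    ("Well B", "2018-06", 50.0),
    ("Well B", "2018-07", 60.0),
    ("Well B", "2018-08", 0.0),
]


def normalized_time(rows):
    """Mimic If([Gas Prod]>0, Rank([Production Date],[Well Name]), null)."""
    dates_by_well = defaultdict(list)
    for well, date, _ in rows:
        dates_by_well[well].append(date)
    # Rank each date within its well (1 = earliest month).
    rank = {
        (well, date): position
        for well, dates in dates_by_well.items()
        for position, date in enumerate(sorted(dates), start=1)
    }
    # Zero-production months get None, like the null in the IF statement.
    return [rank[(well, date)] if gas > 0 else None for well, date, gas in rows]


def cumulative_by_well(rows):
    """Drop the nulls (the filter step), then take a running sum per well."""
    kept = defaultdict(list)
    for (well, _, gas), r in zip(rows, normalized_time(rows)):
        if r is not None:
            kept[well].append((r, gas))
    result = {}
    for well, points in kept.items():
        points.sort()
        running = accumulate(gas for _, gas in points)
        result[well] = [(r, total) for (r, _), total in zip(points, running)]
    return result


print(cumulative_by_well(rows))
```

One thing this sketch makes visible: because the rank is calculated before the zero months are removed, the remaining rank values can skip a number (Well A keeps ranks 1 and 3). The zero month still drops out of the cumulative line, which is the point of the exercise.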

Here is the desired end result. And from my LinkedIn contact…

Awesome, Julie!  Thanks for taking my request.  Looks like it wasn’t too complicated.  This should help a lot of “basic” users like myself.

 

 

Spotfire Version

Content created with Spotfire 7.12.


A Different Way to Build Drop-Down Property Controls

  • Does adding new columns to a data table “contaminate” your drop down property controls?
  • Would you prefer to not have to create column properties for drop-down controls in addition to the document property?
  • Would you like to learn a property control hack?


How do you set colors in Spotfire without reverting back to defaults?

  • Would you like to “set” colors to be applied to all visualizations in a DXP?
  • Have you set the colors for each unique value in a column but find the colors get reset to the default?
  • Are you using property controls to change the Color By variable?


Spotfire Malware Flag

Normally, I lead off with questions intending to help the user decipher whether the post is relevant to them.  In this case, the questions I came up with were almost too comical to take seriously.  Here they are anyway….

  • Are you suspicious that Spotfire is attacking your computer?
  • Has your company’s security team flagged Spotfire temp files?
  • Are you worried malware has been installed on your computer veiled as Spotfire files?
