One of my favorite quotes is this…
There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton from Martin Fowler
Even though I am not a computer scientist by any stretch of the imagination, I can identify with the difficulty of naming things. You see, I just reached a major milestone in a big project. I’ve spent the last couple of weeks really getting to know the data, focusing on designing the right architecture, and finally building out the functionality. Now that it’s built, I have one and only one regret.
Before I tell you what that regret is, I want to say that having only one regret is a damn fine professional accomplishment. Usually, I have a bullet list at the end entitled “When I rebuild this, change these things.” In this case, I think what I built is glorious and fit for purpose, but enough basking in the sunshine.
Project Summary: Combine the same data set from multiple sources, throw in some extra data for funsies, allow users to remove/filter out bad data, and then provide functionality for three different types of analysis. The analysis contains 12 raw tables (as opposed to tables generated by data functions or copies of tables). There is a bit more to it, but that’s the quick and dirty version.
My biggest regret in my latest project (drum roll) is not picking and sticking with a consistent column naming convention throughout the project. Instead, I let the column names come in as they were from each data table. I have regretted this almost every single day of working on the project.
How Did This Happen
So, how did this happen? It happened because changing more than a handful of column names in Spotfire is incredibly time-consuming and tedious. It’s so time-consuming and tedious, I let these excuses win out…
- It will be fine (biggest lie ever).
- It will be easy to change later (that’s also a lie, it won’t be).
- I don’t know what to set the naming convention to, so better to do it later than put out something I have to change later (also, not a good reason, sigh).
To be fair, I did have a naming convention for some things, such as calculated columns and transformations, but not for all things, and it is difficult to know at the start of a project all of the things you might need to consider when defining a naming convention. Anyway, 1 – 3 are all just….
Why The Regret
So, what were the consequences of not setting a column naming convention from the start? Well, there are many.
- It’s not always easy to tell what a column of data is and/or where it came from. If there’s one thing I want for this project, it’s for the content to be easy for users to understand. I could have used the naming convention to better explain where data comes from and what the data is.
- It just looks messy. Some column names are all upper case. Others are lower case. Some have underscores. My naming convention for calculations and transformations includes periods. The OCD part of my brain just can’t handle it.
- The project uses data functions, and now I have to pull out spaces and underscores in code. (See note above….it’s hard to know all the things you’ll need to consider in a naming convention at the start).
- I did have to change some column names, so now I have transformations attached to my tables, which isn’t a huge deal in and of itself, but it adds unnecessary complexity.
- After changing those column names, bits and pieces of the project broke. For example, I had to modify all my pivot transformations and columns feeding to and from data functions (i.e. REWORK, which is bad).
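To give a feel for the cleanup mentioned in the list above (mixed casing, spaces, underscores), here is a minimal sketch of a name normalizer. It is written in plain Python rather than TERR, purely for illustration. The function name `normalize_column_name` and the convention it enforces (capitalized words separated by periods) are assumptions made up for this example, not the convention used in the project.

```python
import re


def normalize_column_name(name: str) -> str:
    """Return a column name in one consistent style.

    Hypothetical convention, for illustration only: words are
    capitalized and joined with periods, e.g. "WELL_NAME" -> "Well.Name".
    """
    # Split on any run of whitespace, underscores, or periods.
    parts = [p for p in re.split(r"[\s_.]+", name.strip()) if p]
    # Capitalize each word so upper, lower, and mixed case all match.
    return ".".join(p.capitalize() for p in parts)


if __name__ == "__main__":
    for raw in ["WELL_NAME", "oil rate", " Gas__Volume "]:
        print(repr(raw), "->", normalize_column_name(raw))
```

Running every incoming column name through one function like this, before anything else touches the data, would mean downstream code never has to guess which style a given table uses.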
How to Rename Columns in Spotfire
There are many different ways to rename columns in Spotfire including….
- In the information designer/information link if you have info links
- In SQL code if you have data connections
- In column properties
- With a transformation
- With a TERR data function
- I assume also with IronPython, although I haven’t tried this route
Each of these methods has pros and cons, some more “cony” than others. None of them are terribly easy, and some of them aren’t even always an option, like changing the column names in the information designer (if you don’t have admin rights) or writing a TERR data function (if you don’t know TERR/R code).
So, what would I do if I had to do it all over again? Well, I would assume I wouldn’t get it right on the first try. Naming things is an incredibly difficult task, and I’m fairly certain more than one go would be required. With that in mind, I would use an easy-to-edit TERR data function. The other methods, such as modifying the information link and adding transformations, are simply too time-consuming. Let’s be real. If I did it the hard way (modify the info link or use transformations), this is what would happen….I would take the time-consuming route and do it once, utter a huge sigh of relief, and then cry into my keyboard when I realized I had to do it again.
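The appeal of the data function route is that all the renames can live in one lookup table that is cheap to edit on the second (or third) attempt. Below is a rough sketch of that idea in plain Python rather than TERR, purely for illustration. The names `RENAME_MAP` and `rename_columns`, and the example column names, are hypothetical. How the column names get in and out of Spotfire is not shown.

```python
# Hypothetical rename table: edit this in one place when the
# naming convention changes. Keys are incoming names, values are
# the names the rest of the project should see.
RENAME_MAP = {
    "WELL_NAME": "Well Name",
    "oil_rate": "Oil Rate",
}


def rename_columns(columns, rename_map):
    """Return a new list of column names with the map applied.

    Names missing from the map pass through unchanged. A ValueError
    is raised if two columns would end up with the same name.
    """
    renamed = [rename_map.get(col, col) for col in columns]
    duplicates = sorted({col for col in renamed if renamed.count(col) > 1})
    if duplicates:
        raise ValueError(f"Renaming would create duplicate columns: {duplicates}")
    return renamed


if __name__ == "__main__":
    print(rename_columns(["WELL_NAME", "oil_rate", "API"], RENAME_MAP))
```

The duplicate check is there because a careless edit to the table could otherwise silently map two source columns onto the same name.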
I know some TIBCO folks read this. Please, please, please add functionality to the tool that makes it easy to change many column names at one time without breaking other parts of the project. It would dramatically improve the software. We Spotfire developers have to compete with things like company column naming standards that don’t always make sense in every project. The struggle is real. I swear.
Take the time to create and execute a standard column naming convention. It is incredibly time-consuming, especially if you want to update it later, but it’s worth the effort.
And now, just because this post was kind of wordy, and you made it to the end….here’s a photo of my dog when he was a puppy and my husband when he was growing a “yeard”. I like ’em both a whole lot. You are welcome.
Guest Spotfire blogger residing in Whitefish, MT. Working for SM Energy’s Advanced Analytics and Emerging Technology team!