A few weeks ago, I wrote this post on my biggest regret when building a large project. That regret was not developing and consistently applying a column naming convention. A week or so after publishing, a user asked via comments for best practices on naming conventions. I’ve had something in draft since then, but I’ve struggled to git ‘er done because it’s such a meaty subject. I cleared that hump this week when I saw a data table naming convention that I could elaborate on. Then the light bulb turned on. It’s not workable to develop a single naming convention for Spotfire. Rather, develop a different convention for different Spotfire elements. Now, I can break the subject into manageable bite-sized pieces.
So, I’m going to write a series on naming conventions for the Spotfire “elements” shown below. These are the bits and pieces where I’ve run into difficulty with names.
- Data Tables
- Document Properties
- Data Functions & Scripts
- Data Connections
This post will cover naming conventions for data tables. In the following weeks, I’ll work through the other naming conventions. Some content will be formulaic: do this, not that. However, other content will provide considerations for developing your own naming convention.
Lastly, when titling this post, I was hesitant to call this content a best practice. In my experience, naming conventions meld best practice and personal preference. Best practices have been tried and tested by lots of folks. They are what works for large groups of people. Personal preference is sometimes not a preference but more the result of working conditions. You know it’s not the best, but it’s the best you can do.
Why a Naming Convention
Now, you might think a naming convention for data tables is overkill. Please reconsider if one of the following might apply.
- You are building something someone else will have to maintain. That someone else will thank you later if they inherit a project that is easy to understand.
- You sleep or drink between now and the next time you work on this project (trust me on this one).
- You are working on a project with another person (it will get messy).
Furthermore, I don’t believe one naming convention can meet all naming needs.
- Spotfire elements have different “communication requirements”. In other words, you use the name to communicate useful information about a table, column, etc.
  - For tables, it’s helpful to communicate the type of data source, the data source, and the sequence of tables.
  - With columns, you might want to communicate the type of column, where it came from, or how it was created (ex. transformation, data function).
  - For data functions and scripts, you might want to communicate what the script does or what it impacts.
- Some Spotfire elements have limitations, like column properties, which cannot have spaces.
- Some Spotfire element names aren’t editable, like data function names.
- Sometimes column names (more specifically the special characters) impact code.
- Space is also an important consideration for some elements like data table names (in the legend) and column names (word wrap in tables).
Thus, there are many things to consider when naming things. Now, let’s talk about naming data tables in particular.
Common Pitfalls When Naming Data Tables
Here are some of the problems I run into when naming data tables.
- The name is too long to see in the legend.
- The data table name equals the name of the table in the database, which is helpful, but …
  - Underscores are used.
  - The name is in all caps (WHY?!?!?!).
- Naming doesn’t make sense to users.
- The data table name doesn’t equal the name of the table in the database and is thus hard to track down.
- The name isn’t distinguishable from other table names.
- The name doesn’t communicate useful information, and users must go into the General tab for more information.
Do you know of other common problems? Leave a comment! So now that we know what we don’t like, let’s talk about what we do like and/or what to take into consideration.
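A few of the pitfalls above are mechanical enough to check automatically. The following is a minimal sketch in plain Python, not a Spotfire API. The function name `check_table_name`, the warning texts, and the 30-character limit are all assumptions made for illustration; the right length limit depends on how wide your legend is.

```python
# Hypothetical limit: the post gives no number for "too long to see in the legend".
MAX_LEGEND_LENGTH = 30


def check_table_name(name, existing_names=(), max_length=MAX_LEGEND_LENGTH):
    """Return a list of warnings for one candidate data table name."""
    warnings = []

    # Pitfall: the name is too long to see in the legend.
    if len(name) > max_length:
        warnings.append("too long for the legend")

    # Pitfall: underscores are used.
    if "_" in name:
        warnings.append("contains underscores")

    # Pitfall: the name is in all caps (ignore digits, spaces, and punctuation).
    letters = [c for c in name if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        warnings.append("all caps")

    # Pitfall: the name isn't distinguishable from other table names.
    # Here that only means "same name ignoring case", a deliberately simple test.
    if name.lower() in {other.lower() for other in existing_names}:
        warnings.append("not distinguishable from another table")

    return warnings
```

For example, `check_table_name("WELL_HEADER")` flags both the underscore and the capitalization, while `check_table_name("Well Header")` comes back clean. The remaining pitfalls, such as whether a name makes sense to users, still need a human.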
Considerations for Data Table Naming Conventions
Here are the items I considered when kicking off a new project. I spent a painful amount of time renaming tables, but seeing it goes a long way. Feel free to comment on what you like or don’t like.
- Type of data source (information link, data connection) or data source itself (WV stands for the WellView database).
- Data table creation (ex. table created by a data function) or versions of a table (ex. pivoted or unpivoted).
- Relations and connectivity of tables (experimented with this one and rejected it…no screenshot).
My Naming Convention
Do you remember what I said above about best practice versus personal preference? That really came into play in my final decisions. Here’s what I settled on.
- I wanted the data source type to be clearly identified with square brackets. I experimented with adding the type at the beginning and end of the data table name. It’s better for the legend to put it at the end, but I liked it at the beginning.
- I included the data source name at the end in square brackets.
- If data originates from within the project, that is noted instead of the data source type.
- Pivot and unpivot transformations receive a designation.
- If there is no transformation, I used the term [From Current Analysis].
- If a data function or script creates the table, that goes into the square brackets.
- I despise uppercase and underscores. However, I have a strong preference for matching what’s in the database. I’ve seen far too many similarly named tables in databases. I don’t want to work through the information designer to find the true data source.
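One way to apply a convention like this consistently is to build names with a small helper instead of typing them by hand. The sketch below is plain Python and only illustrates the bracket layout described above: origin at the beginning, data source name at the end. The function name `build_table_name`, the `IL` abbreviation for information link, and the sample table name `wvWellHeader` are hypothetical; only `WV` (WellView) and `From Current Analysis` come from this post.

```python
def build_table_name(base_name, origin, source_name=None):
    """Compose a data table name as '[origin] base name [source]'.

    origin      -- the data source type (ex. a hypothetical 'IL' for information
                   link), or how the table was created inside the project
                   (ex. 'Pivot', 'Unpivot', 'From Current Analysis', or the
                   name of a data function or script).
    base_name   -- the table name itself, ideally matching the database.
    source_name -- optional data source name (ex. 'WV' for WellView).
    """
    name = "[{}] {}".format(origin, base_name)
    if source_name:
        name += " [{}]".format(source_name)
    return name


# Usage: one table loaded from a database and one derived inside the analysis.
from_database = build_table_name("wvWellHeader", "IL", "WV")
derived = build_table_name("Well Header", "From Current Analysis")
```

Here `from_database` is `[IL] wvWellHeader [WV]` and `derived` is `[From Current Analysis] Well Header`. Moving the origin to the end of the name, which reads better in the legend, would only mean changing the order inside the helper.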
At the end of the day, it comes down to individual priorities in conveying information. I could write about this all day, and I have no doubt I will update the post eventually with new learnings. While working on this, I found YouTube channels dedicated to the subject of naming things. Now that is a deep, deep rabbit hole.
As previously indicated, there were some rejects. Instead of underscores, I tried periods. I may still go back to periods. Underscores take up so much space.
I also tried putting the data source type at the end of the name, but I just didn’t like it.
Lastly, I experimented with nomenclature to indicate the presence of relations or integrated filtering. I thought it would be useful, but I hated it so much, I didn’t even take a screenshot.
My final recommendations are these…
- Just start. Create your own naming convention.
- Write it down and keep it handy.
- Review at the start of every new project.
- Modify it if it doesn’t work for you.
- Show it to other people, and get their opinions.
If you made it to the end of this post, you are rewarded with this link to an awesome Dilbert cartoon on naming conventions. I can’t legally put it on this blog post without paying for it, so the link will have to do.
Content created using Spotfire 7.12.
Guest Spotfire blogger residing in Whitefish, MT. Working for SM Energy’s Advanced Analytics and Emerging Technology team!