Category: Data Science

Data Science Toolkit Improvements

This week, I was able to test out the latest changes to the Data Science Toolkit.  New options and features let users easily split data into test and training sets prior to model building, as all good data scientists should!  This new functionality speeds up your analysis by making model building and evaluation faster and more efficient.  I worked up this video to demonstrate.
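As a rough sketch of the underlying idea (not the Toolkit's own implementation), a random test/training split in base R might look like the following. The iris data set, the 70/30 ratio, and the variable names are all assumptions chosen for illustration.

```r
# Hypothetical example data: the iris data set ships with base R.
set.seed(123)                        # make the random split repeatable
n <- nrow(iris)

# Assumed ratio: 70% of rows for training, the rest held out for testing.
train_rows <- sample(seq_len(n), size = round(0.7 * n))

training <- iris[train_rows, ]       # rows used to build the model
test     <- iris[-train_rows, ]      # held-out rows used to evaluate it
```

Splitting before any model is built keeps the test rows completely unseen, so the evaluation reflects how the model handles new data.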

 Data Science Toolkit for Spotfire

The Data Science Toolkit brings the power of advanced data science to Spotfire. It was designed with simplicity and efficiency in mind to support a wide range of analytics applications. This extension is coupled with comprehensive training that gives both beginner and experienced users a strong foothold in data science analysis.  The Data Science Toolkit is available to Premium subscribers.  Once it is deployed on your Spotfire server, you can quickly and easily access the toolkit via the Tools menu as shown below.  Find out more, including videos, at this link.

Data Science Toolkit menu

Please feel free to reach out to me or anyone else on the team to learn more about this amazing product.  We love to talk about it!


Real Estate Secrets: Hidden Trend Visualization

Everyone who has ever owned or lived in a house knows at least a little bit about the whims of the real estate market. Big houses cost more, neighborhood matters, proximity to basic services is great, age and style are important in some markets, you name it. But what is it that matters the most? This is a question that visualization can help us answer.


Using Support Vector Machines in Spotfire


Support Vector Machines (SVMs) are among the most popular and widely used machine learning algorithms today. They are robust and can tackle both classification and regression problems. In general, SVMs can be relatively easy to use, have good generalization performance, and often do not require much tuning. Follow this link for further information regarding support vector machines. To help illustrate the power of SVMs, we thought it would be useful to go through an example using a custom template we have created for SVMs.


Spotfire Troubleshooting — Fixing a Bug in Spotfire Logistic Regression Modeling

Two weeks ago, I published a Linear and Logistic Regression template that can be found here.  When I built the template, my process was as follows:

  1. Add test and training data sets
  2. Build model on training data set
  3. Insert predicted column based on model in test data set

When I follow this process for the logistic regression model (a classification model), Spotfire inserts two columns of data: ProbPrediction and ClassPrediction.  ClassPrediction holds the predicted class and ProbPrediction holds the associated probability.  I noticed that some records contained a value for ClassPrediction but not ProbPrediction, which seemed odd.  This happened in records where one or more of my predictor columns were null, in which case neither column should have been populated.
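To make the expected behavior concrete, here is a minimal base R sketch of the same three-step process. It is not the template's actual code: the mtcars data, the am ~ wt formula, and the 0.5 class cutoff are assumptions for illustration, and only the two column names come from the template.

```r
# Hypothetical illustration data: mtcars ships with base R.
# 'am' (0/1 transmission type) stands in for a binary response.
set.seed(42)
train_idx <- sample(seq_len(nrow(mtcars)), size = 22)
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Blank out one predictor value in the test set to mimic a null.
test$wt[1] <- NA

# Step 2: build the logistic regression on the training data only.
model <- glm(am ~ wt, data = train, family = binomial)

# Step 3: insert predicted columns into the test data.
# predict() returns NA for rows whose predictors are missing.
test$ProbPrediction <- predict(model, newdata = test, type = "response")

# Assumed cutoff of 0.5; an NA probability yields an NA class.
test$ClassPrediction <- ifelse(test$ProbPrediction >= 0.5, 1, 0)
```

In this sketch the row with the missing predictor ends up with NA in both columns, which is the consistent result I expected from the template.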

It turns out that this is a bug that can be fixed with the steps below.

  1. Go to the Tools menu and select TERR Tools
  2. Click the Launch TERR Console button
  3. Type getOption("repos")
  4. Type install.packages("SpotfireStats")
  5. Type q() to exit the TERR console
  6. Close Spotfire and relaunch it
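If you want to confirm that the package is present after step 4, or later in a fresh console, a quick check along these lines should work. This is an optional sketch rather than part of the official fix, and the variable names are my own.

```r
# Package name taken from step 4 above.
pkg <- "SpotfireStats"

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " version: ", as.character(packageVersion(pkg)))
} else {
  message(pkg, " is not installed in this library")
}
```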

See below for a screen shot of the console.

After I relaunched Spotfire and reran the model, both the ProbPrediction and ClassPrediction columns were populated consistently.  If you have any questions, feel free to contact me.

5 Simple Prep Steps for Multivariate Analyses

Trust me, I get excited about a new data set just like anybody. I just want to tear straight to the good stuff, find those hidden correlations, and use all my fancy tests and methods. But before you get to that point, it’s important to run a few preparation steps on your data, each a critical “Prep Step” that can save you time, rework, and wrong conclusions down the line.


Spotfire Data Functions — TERR Basics

  • Have you experienced difficulty trying to implement data functions or TERR code from blog posts?
  • Do you find that blog posts frequently assume you know how data functions work and that simple but important steps are missing?
  • Can you follow blog post steps but don’t know why a step is required and thus can’t implement it in a different scenario?
  • Do you feel like you are lacking in a foundational or basic understanding of how TERR works in Spotfire?
