Most problems in the scientific world are about understanding different phenomena. We want to learn the characteristics and patterns of the systems we study to be able to preview and predict behavior. As humans, we learn by observing these processes when they happen naturally or with controlled experiments. This might not be an option if we are studying a rare or dangerous event.
I received an interesting request from a user that deserves sharing. The user requested a visualization showing the curve of a normal distribution of data points. Now, just to be clear, a visualization that shows the distribution of data points is a histogram, which looks like this:
The histogram's shape changes a little if you change the number of bins, but it always has the continuous value along the X-Axis and (Row Count) on the Y-Axis. However, the user didn’t want to see the bars of the histogram, just a curve that represented the histogram, which would look like this:
Normal Distribution Curve
This type of visualization is simple and easy to create in Spotfire using the following steps.
Creating the Visualization
- Add a bar chart
- Configure the X-Axis with the continuous value and the Y-Axis with (Row Count)
- On the X-Axis, click the down arrow on the axis selector and make sure the “Auto-Bin” box is checked.
- If needed, right-click on the axis selector and choose “Number of Bins” to set the desired number of bins.
- In the legend, click on the color circle and color the bars the same color as the background (probably white).
- Go to Properties > Lines & Curves > Add > Gaussian Curve fit
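BAM! Done! The Gaussian Curve fit draws a normal distribution (bell) curve fitted to the bars, so it represents the histogram as a smooth curve. Keep in mind that it tracks the bars closely only when the data is roughly normally distributed. If you combine the curve and the histogram, it looks like this: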
In the end, Spotfire had the functionality to quickly and easily meet the user’s needs!
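If you are curious how the curve relates to the bars numerically, here is a minimal Python sketch of the same idea outside of Spotfire. It is not Spotfire's implementation: it assumes equal-width bins and simply scales a normal density (using the sample mean and standard deviation) to the row-count axis, whereas Spotfire's Gaussian Curve fit may estimate its parameters differently. The function names `histogram_counts` and `normal_curve` and the randomly generated sample data are made up for illustration.

```python
import math
import random
import statistics


def histogram_counts(values, num_bins):
    """Count values in equal-width bins (assumes the values are not all identical).

    Returns (bin_centers, counts, bin_width).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(num_bins)]
    return centers, counts, width


def normal_curve(values, centers, width):
    """Expected row count at each bin center under a normal distribution.

    The density is scaled by (number of rows * bin width) so it sits on the
    same Y-axis as the histogram's row counts.
    """
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    scale = len(values) * width / (sigma * math.sqrt(2 * math.pi))
    return [scale * math.exp(-0.5 * ((c - mu) / sigma) ** 2) for c in centers]


# Hypothetical sample data standing in for the continuous column.
random.seed(0)
data = [random.gauss(50, 10) for _ in range(5000)]

centers, counts, width = histogram_counts(data, num_bins=20)
curve = normal_curve(data, centers, width)

# Print bin center, row count (the bar), and curve height side by side.
for center, count, height in zip(centers, counts, curve):
    print(f"{center:7.2f}  {count:5d}  {height:8.1f}")
```

Changing `num_bins` changes both the bar heights and the curve's scale, which is why the histogram and curve look a little different at different bin settings.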
Guest Spotfire blogger residing in Whitefish, MT. Working for SM Energy’s Advanced Analytics and Emerging Technology team!