Additional Capabilities of @Risk
I use @Risk often, so I want to show you another example of just how useful this Excel add-in can be. The example discussed here involves fitting a distribution to a set of data. Before getting to that capability of @Risk, I need to digress briefly using graphs generated in Minitab, a well-designed and very user-friendly statistical analysis program. So please bear with me as I first analyze the data in Minitab before moving on to distribution fitting in @Risk.
During a recent trip I was fortunate, as I often am, to obtain 1.5 years of detailed daily operating data for the wastewater treatment plant at a petrochemical plant, as part of a plant audit/site survey that the company I work for was conducting. My first “pass” at the data, an exploratory one, was to generate histograms with a normal distribution fit to the data in Minitab, as shown in Figure 1. This histogram shows the daily average temperature of the wastewater entering a bioreactor, with each daily average based on eight sample readings. From Figure 1 we can see that the mean temperature was 99.46°F (37.48°C).
Figure 1: Histogram of Wastewater Temperature
It is fairly obvious from Figure 1 that the data is skewed left, so the normal distribution fits the data poorly. We will correct this poor fit in just a moment, using the “Distribution Fitting” capability of @Risk. Before doing this we will take one more look at the wastewater temperature data.
I have strong views on what the maximum wastewater temperature should be when it enters a biological (activated sludge) treatment system. My experience indicates an upper limit of 95°F (35°C) for the wastewater entering the bioreactor, although I find that there is disagreement about the recommended maximum. Nonetheless, my experience shows that as this temperature [95°F (35°C)] is exceeded, the bacteria become stressed. One major negative result is that the mixed liquor suspended solids (MLSS) settle poorly in the secondary clarifier because the bacterial population is dispersed. The temperature data is summarized in Table 1.
Table 1: Statistical Summary of Wastewater Temperature Data
Using @Risk, I fit a distribution to the data, as shown in Figure 2. This is one of my favorite features of @Risk, and I use it as part of my exploratory data analysis, which is the first step I take when evaluating a new set of data with unknown characteristics. The most appropriate distribution, as determined by @Risk, is an “Extreme Value Minimum” distribution, which effectively captures the negative skewness in the data. I purposely adjusted the two vertical sliders in Figure 2 to cover a temperature range from 95°F (35°C) to 110°F (43.33°C), the maximum value in the data. This range covers 83.2% of the fitted distribution. In other words, the fit indicates that 83.2% of the time, during a period of 1.5 years, the wastewater temperature entering the bioreactor exceeded the maximum recommended temperature of 95°F (35°C).
Operating a bioreactor at these elevated temperatures creates an extremely stressful environment for the bacteria and generally results in poor performance. This was certainly in evidence at this wastewater system, which has a chronic inability to maintain a sufficient mixed liquor suspended solids concentration in the bioreactor due to poor settling in the secondary clarifier and excessive solids carryover.
Figure 2: @Risk Fitted Distribution for Wastewater Temperature Data
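For readers who do not have @Risk at hand, the kind of fit shown in Figure 2 can be roughly sketched in Python with SciPy. This is only an illustration under stated assumptions: I am assuming that the “Extreme Value Minimum” distribution corresponds to what SciPy calls `gumbel_l`, and the data below is a synthetic stand-in (the `loc`, `scale`, sample size, and the name `temps_f` are hypothetical choices, not the plant's readings), so the printed numbers will not reproduce the 83.2% in Figure 2.

```python
from scipy import stats

# ASSUMPTION: synthetic stand-in for the plant data, not the real readings.
# loc and scale are illustrative values that give a left-skewed sample with
# a mean near 99.5 degF; the sample is larger than the real one for stability.
temps_f = stats.gumbel_l.rvs(loc=101.8, scale=4.0, size=5000, random_state=42)

# Fit the smallest extreme value (Gumbel minimum) distribution by
# maximum likelihood; fit() returns the location and scale estimates.
loc, scale = stats.gumbel_l.fit(temps_f)
fitted = stats.gumbel_l(loc=loc, scale=scale)

# Probability between the two "sliders": 95 degF up to the sample maximum.
lower, upper = 95.0, temps_f.max()
p_range = fitted.cdf(upper) - fitted.cdf(lower)

print(f"sample skewness = {stats.skew(temps_f):.2f}")  # negative = skewed left
print(f"fitted loc = {loc:.2f}, scale = {scale:.2f}")
print(f"P({lower:.0f} <= T <= {upper:.1f}) = {p_range:.1%}")
```

With real data, `temps_f` would simply be replaced by the column of daily average temperatures.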
There is a point worth noting regarding the importance of fitting a proper distribution. In Figure 2, with the extreme value minimum distribution, we can see that 83.2% of the time the temperature is ≥95°F (35°C). If we did not have information about the “true” nature of the distribution, and had used a normal distribution to estimate the percentage of temperatures in excess of 95°F (35°C), we would have underestimated that percentage. This is shown in Figure 3, which indicates that the wastewater temperature was ≥95°F (35°C) only 76.7% of the time.
Figure 3: Minitab Normal Probability Distribution of Temperature Data
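The effect described above, a normal fit understating how often a left-skewed variable exceeds a limit, can be reproduced in a small Python sketch. Again, this uses synthetic data and hypothetical names (`temps_f`, `limit_f`) and assumes SciPy's `gumbel_l` as the counterpart of the extreme value minimum distribution; it illustrates the direction of the error, not the 83.2% and 76.7% values from Figures 2 and 3.

```python
from scipy import stats

# ASSUMPTION: synthetic left-skewed sample, not the plant's data.
temps_f = stats.gumbel_l.rvs(loc=101.8, scale=4.0, size=5000, random_state=7)
limit_f = 95.0

# Fraction of the sample at or above the limit (no distribution assumed).
p_empirical = (temps_f >= limit_f).mean()

# Exceedance probability under each fitted model; sf(x) = 1 - cdf(x).
p_normal = stats.norm.sf(limit_f, *stats.norm.fit(temps_f))
p_ev_min = stats.gumbel_l.sf(limit_f, *stats.gumbel_l.fit(temps_f))

print(f"empirical         P(T >= 95) = {p_empirical:.1%}")
print(f"normal fit        P(T >= 95) = {p_normal:.1%}")
print(f"extreme value min P(T >= 95) = {p_ev_min:.1%}")
```

For a sample like this one, the normal estimate should come out a few percentage points below both the empirical fraction and the extreme value estimate, which is the same direction of error seen when comparing Figure 3 with Figure 2.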
In closing, I want to state that Minitab can produce the same estimated value of 83.2% discussed above and shown in Figure 2. This ability of Minitab is portrayed in Figure 4. The difference between @Risk and Minitab though, and it is a significant difference in my opinion, is the ease with which @Risk fits the distribution for you. In fact, it fits many distributions, as listed in Table 2.
Figure 4: Minitab Smallest Extreme Value Probability Distribution of Temperature Data
Table 2: Distribution Fitting and Ranking Statistics for Selection from @Risk
Finally, using @Risk, you can also select the distribution ranking statistic that you prefer. I typically use the default AIC (Akaike information criterion) fit statistic, where smaller is better, but you can choose from the five options shown in Figure 5.
Figure 5: @Risk Options for Selecting Ranking Statistic to Determine Best Fit
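To make the idea of ranking fits by AIC concrete, here is a minimal Python sketch. It assumes the textbook definition AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood; I have not verified that @Risk computes its statistic in exactly this way. The candidate list, the helper name `aic`, and the synthetic data are all hypothetical choices for illustration, and the candidate set is far smaller than the one in Table 2.

```python
import numpy as np
from scipy import stats

# ASSUMPTION: synthetic left-skewed sample, not the plant's data.
temps_f = stats.gumbel_l.rvs(loc=101.8, scale=4.0, size=5000, random_state=0)

# A small, illustrative set of candidate distributions.
candidates = {
    "normal": stats.norm,
    "logistic": stats.logistic,
    "extreme value min (gumbel_l)": stats.gumbel_l,
    "extreme value max (gumbel_r)": stats.gumbel_r,
}

def aic(dist, data):
    """Textbook AIC = 2k - 2*ln(L) for a maximum likelihood fit."""
    params = dist.fit(data)                        # MLE parameter estimates
    log_lik = np.sum(dist.logpdf(data, *params))   # maximized log-likelihood
    return 2 * len(params) - 2 * log_lik

# Sort ascending: the smallest AIC is the preferred fit.
ranking = sorted((aic(dist, temps_f), name) for name, dist in candidates.items())
for score, name in ranking:
    print(f"{name:30s} AIC = {score:.1f}")
```

Because the synthetic sample was drawn from a Gumbel minimum distribution, that candidate should rank first here; with real data the ranking is what tells you which shape the data supports.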
To learn more about @Risk visit www.palisade.com.