Lies, Damned Lies, and… Snowstats?

I was involved in a discussion on Opensnow yesterday. It started with a comment from “gman” that said: “Colorado River Basin SWE at 95% of average as of right now (averaged from all SNOTEL sites). What a great start to this season!” That’s the whole post. Since it contains only two sentences, the obvious reading is that the first is offered as the reason for the second. I pointed out that 95% of average is below average. Strictly speaking, that only holds when the average is positive, which is always the case for snowfall data. The lovely commenters at Opensnow didn’t like that much.

One reply stated that “averages don’t mean much, especially here in CO”. They then went on to compare this year to the last few. This is a great example of recentism. Recentism is the emphasis on recent events without a long-term, historical view, thereby inflating the importance of a topic that has received recent public attention. When we compare to only the last few years, we easily forget the longer-term past. That’s why long-term averages exist. SNOTEL uses the median from the period 1981-2010 as its baseline “average”. That’s a nice long period, and its value clearly has meaning or else it wouldn’t be published.
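To make the baseline point concrete, here is a minimal sketch of a percent-of-median calculation. The SWE numbers are made up for illustration and are not real SNOTEL data, and the function name `percent_of_median` is hypothetical. It shows how the same current value can look below normal against a long baseline and above normal against only the last few years.

```python
from statistics import median


def percent_of_median(current_swe, baseline_swe):
    """Return current SWE as a percentage of the median of the baseline values."""
    baseline = median(baseline_swe)
    if baseline <= 0:
        raise ValueError("baseline median must be positive")
    return 100.0 * current_swe / baseline


# Hypothetical same-date SWE values in inches, one per year (NOT real SNOTEL data).
long_baseline = [3.1, 4.0, 2.6, 5.2, 3.8, 4.4, 3.0]
last_few_years = [2.6, 3.0, 3.1]
current = 3.61

# Compared with the long baseline, the current value is below the median.
print(round(percent_of_median(current, long_baseline), 1))

# Compared with only the last few (hypothetically dry) years, it looks great.
print(round(percent_of_median(current, last_few_years), 1))
```

The choice of baseline changes the answer, which is exactly why a fixed, long reference period is useful.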

Another commenter compared the 95% value to the value a few weeks ago, which was around 0%. This is recentism in the extreme! Yes, we had a nice snowstorm. Yes, there is a lot more snow on the ground now than a few weeks ago. That may mean that there is good skiing right now, but it doesn’t say anything about the total amount of snow that has fallen.

The original poster reminded me “don’t forget the + or – 5% accuracy typically associated with these types of measurements”. I’m not sure what point that was meant to make. All values have an error associated with them. Just because a value is reported as 95% +/- 5% doesn’t mean that it’s 100%. It means it’s probably somewhere between 90% and 100%, with 95% being the best guess.
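The arithmetic is simple enough to write down. This is only a sketch, and `error_interval` is a hypothetical helper name; it treats the quoted 5% as a symmetric range around the reported value.

```python
def error_interval(best_guess, error):
    """Return the (low, high) range implied by best_guess +/- error."""
    return best_guess - error, best_guess + error


low, high = error_interval(95.0, 5.0)
print(low, high)

# 100% sits at the upper edge of the range, exactly as far from the
# best guess as 90% does at the lower edge.
print(high - 95.0 == 95.0 - low)
```

In other words, the error bar makes 90% of average just as plausible as 100% of average.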

My favorite responses were “Whats up debbie downer? Sheesh” and “You and Nathan need to hold hands, take a long wallk and talk about averages till the sun sets.”

[Figure: copper_error_distribution]

This is the error distribution of Opensnow forecasts. You can see that the errors are roughly normally distributed. There’s a peak near zero error and a longer tail on the negative side (under-prediction). But you can’t say that the distribution means nothing! There are many things that pop out immediately from this histogram.

  1. Opensnow over-predicts more than it under-predicts
  2. Under-predictions have larger errors than over-predictions
  3. Predictions are generally good; there aren’t many errors over 4″
  4. Most errors are between -1.5″ and 1.5″
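The four observations above can be read off a list of errors with a few lines of code. The sketch below uses made-up errors, not the data behind the histogram, and assumes the sign convention implied by the text: error is forecast minus actual, so negative means under-prediction. The function name and the dictionary keys are hypothetical.

```python
def summarize_errors(errors, band=1.5, large=4.0):
    """Summarize forecast errors (forecast minus actual, in inches)."""
    over = [e for e in errors if e > 0]    # forecast too high
    under = [e for e in errors if e < 0]   # forecast too low
    return {
        "over_count": len(over),
        "under_count": len(under),
        "mean_over": sum(over) / len(over) if over else 0.0,
        "mean_under": sum(under) / len(under) if under else 0.0,
        "share_within_band": sum(abs(e) <= band for e in errors) / len(errors),
        "count_large": sum(abs(e) > large for e in errors),
    }


# Hypothetical errors in inches (NOT the real forecast data).
errors = [0.5, 1.0, 0.2, 0.8, 1.2, 0.3, -0.5, -2.0, -5.0, 0.0]

for name, value in summarize_errors(errors).items():
    print(name, value)
```

With these invented numbers, over-predictions are more frequent but smaller, and under-predictions are rarer but larger, which is the same pattern as items 1 and 2.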

What this histogram doesn’t show is how the forecast error varies with the actual snowfall amount. For tomorrow, I’ll try to create a scatterplot that shows this. What I expect to find is that as the actual snowfall increases there will be a corresponding increase in the forecast error.
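Before drawing the scatterplot, the expectation can be checked numerically. This is a rough sketch under stated assumptions: the (forecast, actual) pairs are invented, error size is measured as the absolute difference, and a hand-written Pearson correlation stands in for eyeballing the plot. The same two lists could be handed to any plotting library as the x and y values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Hypothetical (forecast, actual) snowfall pairs in inches (NOT real data).
pairs = [(1.0, 0.5), (2.0, 2.5), (4.0, 5.5), (6.0, 9.0), (8.0, 12.0)]

actual = [a for _, a in pairs]              # x-axis of the scatterplot
abs_error = [abs(f - a) for f, a in pairs]  # y-axis of the scatterplot

r = pearson(actual, abs_error)
print(round(r, 2))
```

A value of r close to 1 would support the expectation that bigger storms come with bigger misses; a value near 0 would not.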

 

My name is Nathan Johnson and I have a Master’s degree in Meteorology. I also snowboard and live in Boulder, Colorado. I have a strong desire to see precise, accurate snow forecasts. It is my hope that independent validation and verification will lead to better forecasts.

