May 23 2008
United States Climate Reference Network
I’ve been busy over the last couple of days, not only with my real work, but also with trying to decode the CRN data. About two weeks ago, Anthony Watts posted the link to the FTP site where anyone can download the temperature data from the new network. There are many files. And that’s an understatement. They have one file for every hour that has passed since the initiation of the network in late 2005.
Simply downloading the data took about a day. I have over 21 thousand files, which take up over 185 MB of hard disk space. And this is for only two and a half years of data. If we assume that this network will be in place for as long as the USHCN (~100 years), and that NCDC keeps using the same file storage scheme, a century’s worth of data would total about 7.4 gigabytes spread over more than 840 thousand files.
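For anyone who wants to check the extrapolation, here is a minimal sketch of the arithmetic. It assumes simple linear growth from the rounded figures quoted above (21 thousand files, 185 MB, 2.5 years) and 1 GB = 1000 MB; the variable names are mine.

```python
# Rounded figures from the post; everything below is linear scaling.
files_so_far = 21_000        # "over 21 thousand files"
megabytes_so_far = 185       # "over 185 MB"
years_so_far = 2.5           # late 2005 to mid 2008
target_years = 100           # roughly the lifetime of the USHCN

scale = target_years / years_so_far
files_century = files_so_far * scale
gigabytes_century = megabytes_so_far * scale / 1000  # assuming 1 GB = 1000 MB

# Sanity check on the "one file per hour" claim.
hourly_files_expected = years_so_far * 365 * 24

print(f"scale factor: {scale:g}")
print(f"files after a century: {files_century:,.0f}")
print(f"size after a century: {gigabytes_century:.1f} GB")
print(f"hours in {years_so_far} years: {hourly_files_expected:,.0f}")
```

The hourly count for 2.5 years comes out close to the number of files I actually downloaded, which is consistent with one file per hour.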
Clearly this file management system is not scalable.
In addition to the problem of the file storage format, they changed the format of the files sometime in 2007. Early years had just the ASCII file (ick) with columns for a variety of stuff. Then they decided to include a 2 row header that I can’t decipher. So a lot of good that does me. The header itself isn’t a problem, but it becomes one when you look at the readme and find no mention of the change. Tis okay now though, for me anyway.
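One way to cope with the two formats is to peel off any leading header rows before parsing. The sketch below is not the code I used; it rests on the assumption that every data row begins with a numeric field (such as the WBANNO) and that header rows do not. The function names and the sample rows are hypothetical.

```python
def looks_like_data(line):
    """Return True if the first whitespace-separated field parses as a number.

    Assumption: data rows start with a numeric station identifier,
    while header rows start with text.
    """
    fields = line.split()
    if not fields:
        return False
    try:
        float(fields[0])
    except ValueError:
        return False
    return True


def split_header(lines, max_header=2):
    """Separate up to max_header leading non-data lines from the data rows."""
    rows = list(lines)
    header = []
    while rows and len(header) < max_header and not looks_like_data(rows[0]):
        header.append(rows.pop(0))
    return header, rows


# Hypothetical file contents, one in each format.
old_style = ["53131 20060101 12.3"]
new_style = ["WBANNO DATE TEMP", "XXXXX YYYYMMDD Celsius", "53131 20070601 25.1"]

print(split_header(old_style))
print(split_header(new_style))
```

With this, files from before and after the change can go through the same column parser.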
As far as I can tell, there is no metadata for the USCRN right now. They include the station latitude and longitude in the data files (waste of space), but there is no mention of how to decode the WBANNO or COOPNO. It wouldn’t be that hard to look them up in Google Earth.
After loading all the data, I put it into a NetCDF file, available at uscrn.nc. This is an 11 MB file. That’s more than a factor of 10 smaller. I could probably make it smaller too. It doesn’t contain the precipitation data, so it only has about half as much data as the original files.
This is a graph of temperature for one of the stations. Can you tell which one?
10 Responses to “United States Climate Reference Network”


Why is 7G per century a problem? Disk space is free nowadays. And ascii is a good idea.
[Reply: It's not a problem with storage. It's a problem with transmission. ASCII is okay if the files are small, but once the files get large enough that opening them becomes impossible, they're no better than binary files.]
If it’s in the US and temperature is Celsius, Death Valley.
Anchorage, AK?
Steve, is that you?
Can’t be Celsius because surface temps don’t reach ~140º in the shade.
Fairbanks or Barrow, AK or maybe Chatham, MI.
Best,
D
[Reply: You're right that it can't be Celsius. But that's what the documentation says it is. And the temperatures at the other stations don't get low enough for it to be Fahrenheit.]
Good work ATMOZ.
I looked at the data and figured my days of copying an ASCII file into Excel are over.
That said, since you have the hourly data in NetCDF, perhaps we can persuade some guys to write routines to use your NetCDF and
1. Output Daily TMAX,TMIN,TAVE,(TMAX+TMIN)/2
2. Monthly Values… see above.
3. Yearlies..
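As a rough sketch of item 1, assuming the hourly values have already been read into (date, temperature) pairs, the daily numbers could be computed as follows. The function name, the "TMID" key for (TMAX+TMIN)/2, and the sample readings are all made up for illustration; nothing here reads the NetCDF file itself.

```python
from collections import defaultdict
from statistics import mean


def daily_summary(hourly):
    """Group (date, temperature) pairs by date and summarize each day."""
    by_day = defaultdict(list)
    for day, temp in hourly:
        by_day[day].append(temp)

    summary = {}
    for day in sorted(by_day):
        temps = by_day[day]
        tmax, tmin = max(temps), min(temps)
        summary[day] = {
            "TMAX": tmax,
            "TMIN": tmin,
            "TAVE": mean(temps),        # mean of all hourly readings
            "TMID": (tmax + tmin) / 2,  # the (TMAX+TMIN)/2 estimate
        }
    return summary


# Made-up readings for illustration only.
sample = [
    ("2008-05-01", 10.0),
    ("2008-05-01", 20.0),
    ("2008-05-01", 12.0),
    ("2008-05-02", 5.0),
]
for day, stats in daily_summary(sample).items():
    print(day, stats)
```

Monthly and yearly values follow the same pattern with a coarser grouping key, such as the first seven or first four characters of an ISO date string.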
Looks like a problem for MATLAB. I never really liked Excel myself. I once graphed a waveform of a vibrating beam in Excel. I had about 30,000 points and Excel just can’t handle that magnitude of data. If I made a typo, it took 3-5 minutes to fix because the program became sluggish. I think I might make this a side project of mine.
My guess is the Univ. of Arizona parking lot.
So is it Fairbanks or Barrow, Atm?
Best,
D
[Reply: I don't know where it is. But it's one of the warmest of the CRN stations, so a good guess would be neither Fairbanks nor Barrow.]
Wow. Scary. Likely Stovepipe Wells, then, but I still have a hard time believing all those obs over 55º. That set isn’t over grass like metro or ag stations.
Best,
D