Summary of Virginia Tech Downloads
This is a summary of how Virginia Tech collects data from each of the SuperDARN institutions.
Individual institution summaries
Virginia Tech radars (bks, fhe, fhw, gbr, kap) & Wallops Island (JHU/APL)
Each of these radars has its own shell script that runs on sd-data which calls a number of sub-scripts to do different tasks. For these radars we download the following file types:
Radar | File types
bks | rawacf, iqdat, scdlog
fhe | rawacf, iqdat, errlog, scdlog, fft (if available)
fhw | rawacf, iqdat, errlog, scdlog, fft (if available)
gbr | rawacf, iqdat, scdlog
kap | rawacf, scdlog (iqdat files are not generated here)
wal | rawacf
Each of these main scripts is in charge of staging up a copy of the rawacf files for the University of Saskatchewan to download to their archive. These main scripts also handle placing the different file types into the Virginia Tech archives.
As mentioned above, each main script calls sub-scripts in order to download each file type. These sub-scripts are all fairly similar; they differ mainly in download speed and in what is done with the file at the radar site once a copy has been downloaded to Virginia Tech.
The scripts start by gathering a list of available files to download and collecting the sha1sum hash output for each file. For each hash output, the file's extension is checked to see whether the file will need to be compressed with the bzip2 program. Then the year of the file is read off so that a note can be emailed if the file is from an earlier year than the current one or from the future. Next, the scripts check whether a file with the same filename already exists in our data archive; if it does, the hashes of the two files are compared. If the filename does not exist in our archive, we download the file. The scp file transfer command is the basis for transferring data in all of these sub-scripts.
Once the file is downloaded, the script checks the exit code of the scp command to see whether the transfer finished properly. If it did not, the script deletes the downloaded file, since it is probably partial or corrupt. If the scp command exited gracefully, the first check is that the file contains some data and is not zero length. Then the sha1sum hash is computed on the newly downloaded file; if this hash matches the hash found at the radar site, we know we've downloaded the entire file. After this check, since most of the files are compressed at the radar sites in order to save space in the event of a network disconnection, a decompression test is performed on the file. If the file wasn't corrupted in the compression process, then finally, we know we have a good file to store and pass along.
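The per-file checks described above might be sketched roughly as follows. This is a hypothetical illustration, not the actual sub-scripts: the function name, filename, and the locally created stand-in file are all invented, and in the real scripts the file would arrive via scp with the expected hash collected at the radar site.

```shell
#!/bin/sh
# Hypothetical sketch of the post-download verification steps.
# verify_file FILE EXPECTED_SHA1 -> succeeds only if every check passes.
verify_file() {
    file="$1"; expected="$2"
    # Reject zero-length (failed or partial) downloads
    [ -s "$file" ] || return 1
    # Compare the local hash against the one gathered at the radar site
    actual=$(sha1sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ] || return 1
    # Decompression test: bzip2 -t checks integrity without extracting
    bzip2 -t "$file" || return 1
}

# Example run with a locally created stand-in for a downloaded file
printf 'rawacf payload\n' > sample.dat
bzip2 -f sample.dat                               # produces sample.dat.bz2
hash=$(sha1sum sample.dat.bz2 | awk '{print $1}')
verify_file sample.dat.bz2 "$hash" && echo "good file"
```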
It’s at this point that the most variation between the sub-scripts occurs. One sub-script may remove the file from the radar site computer, while another sub-script might move the file into a backup directory. Again, the process of copying and moving files for our archive and for staging up for the Univ. of Sask. is handled by the main script for each radar.
University of Saskatchewan radars (cly, inv, pgr, rkn, sas)
The University of Saskatchewan (USask) stages up their rawacf files whenever the files are downloaded from each radar. Virginia Tech has an account on the server that the data is staged up on and is able to perform a number of shell commands. The primary transfer method for these radars is an scp command, much like what is done with the Virginia Tech radars.
After a list of files is found, the sha1sum hash output for each file is obtained. Much like before, each file is checked to see whether it already exists in our data archive, and if so, the hashes of the two files are compared. If the file doesn't exist in our data archive, we proceed to download it. Once the file is downloaded with an scp command, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file staged on the USask server to make sure all of the data was downloaded. Lastly, the file is checked for compression integrity with a decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Lastly, the copy of the file on the USask staging area is removed so that it will not be downloaded again.
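The stage-archive-cleanup bookkeeping above could look roughly like this; all directory names, the filename, and the remote host are hypothetical placeholders, and the remote removal is shown only as a comment:

```shell
#!/bin/sh
# Hypothetical sketch of the post-verification bookkeeping (paths invented).
STAGING="usask_staging"; ARCHIVE="vt_archive"
mkdir -p "$STAGING" "$ARCHIVE"

file="20140101.0200.00.sas.rawacf.bz2"
printf 'data' > "$file"            # stand-in for a verified download

cp "$file" "$STAGING/"             # stage a copy for USask to pull
mv "$file" "$ARCHIVE/"             # move the original into the VT archive
# Lastly the staged copy on the remote server would be removed, e.g.:
#   ssh vt@staging.example.ca "rm /outgoing/$file"
```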
University of Alaska Fairbanks radars (ade, adw, kod, mcm, sps) & ksr (NICT)
The University of Alaska Fairbanks (UAF) has set up a staging area on one of their servers into which data downloaded from their radars, as well as from King Salmon, is copied. Unlike the USask setup, the login to the UAF staging area is restricted so that only rsync access is allowed. No shell commands can be used to transfer data.
Since rsync is the only allowed access, all of the files for a radar are transferred from the UAF server to Virginia Tech using rsync. During this process, the files that are successfully transferred are removed automatically from the UAF server with the "--remove-sent-files" option of rsync.
Once all of the files are downloaded, each file is checked for its file size and for the compression extension on the filename (and converted to bzip2 if necessary). Then the filename is checked to see if the same filename is in our archives. If the file is in our archive, the sha1sum hash outputs are compared between the two files and the appropriate error message is output. If the filename is completely new, the file is checked with a decompression test. If the file passes this test, it is staged up for USask to download and moved into our data archive. This process only happens for rawacf files. Fitacf files are staged up by UAF and downloaded to Virginia Tech, but these files are placed in a temporary directory in case there is ever a question about fitacf files being generated at a radar versus in post-processing.
IRAP, CNRS/LCPE radar (ker)
Data coming from the Kerguelen radar is limited due to bandwidth restrictions imposed by the network provider. On a daily basis, only 8 hours of rawacf files and 24 hours of fitacf files are downloaded from the radar site. Every few months, the accumulated data is copied to a physical disk and sent to CNRS/LCPE, where it is uploaded to a staging area. Virginia Tech has an account on the staging area server and is able to perform a number of shell commands. The primary transfer method is an scp command, much like what is done with the Virginia Tech radars.
After a list of files is found, the sha1sum hash output for each file is obtained. Much like before, each file is checked to see whether it already exists in our data archive, and if so, the hashes of the two files are compared. If the file doesn't exist in our data archive, we proceed to download it. Once the file is downloaded with an scp command, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file staged on the CNRS/LCPE server to make sure all of the data was downloaded. Lastly, the file is checked for compression integrity with a decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Lastly, the copy of the file on the CNRS/LCPE staging area is removed so that it will not be downloaded again.
SANSA radar (san)
The South African National Space Agency (SANSA) stages up their rawacf files whenever the files are downloaded from the SANAE radar. Virginia Tech has an account on the server that the data is staged up on and is able to perform a number of shell commands. The primary transfer method is an scp command, much like what is done with the Virginia Tech radars.
After a list of files is found, the sha1sum hash output for each file is obtained. Much like before, each file is checked to see whether it already exists in our data archive, and if so, the hashes of the two files are compared. If the file doesn't exist in our data archive, we proceed to download it. Once the file is downloaded with an scp command, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file staged on the SANSA server to make sure all of the data was downloaded. Lastly, the file is checked for compression integrity with a decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Lastly, the copy of the file on the SANSA staging area is removed so that it will not be downloaded again.
STELab radar (hok)
The Solar-Terrestrial Environment Laboratory (STELab) stages up their rawacf files whenever the files are downloaded from their radar. Virginia Tech has an account on the server that the data is staged up on and is able to perform a number of shell commands. The primary transfer method is an scp command, much like what is done with the Virginia Tech radars.
After a list of files is found, the sha1sum hash output for each file is obtained. Much like before, each file is checked to see whether it already exists in our data archive, and if so, the hashes of the two files are compared. If the file doesn't exist in our data archive, we proceed to download it. Once the file is downloaded with an scp command, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file staged on the STELab server to make sure all of the data was downloaded. Lastly, the file is checked for compression integrity with a decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Lastly, the copy of the file on the STELab staging area is removed so that it will not be downloaded again.
La Trobe University radars (tig, unw)
La Trobe University has set up an account for Virginia Tech with access to their entire archive of TIGER (Bruny Island) and Unwin data sets. Because this is not access to a staging area, it cannot be assumed which files are new. So, for each radar, the entire La Trobe data archive is searched for files that don't exist in Virginia Tech's archive. With thousands of files on the La Trobe archive, checking each file for an updated version would be an intensive, very slow process. In these download scripts, filenames that match on both the La Trobe archive and the Virginia Tech archive are assumed to be the same file, without checking the contents of the file.
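The filename-only comparison described above amounts to a set difference between the two archives' file listings. A minimal sketch, with invented list contents, using standard `comm`:

```shell
#!/bin/sh
# Hypothetical sketch: find filenames present on the La Trobe archive but
# absent from the VT archive. The list contents here are invented.
printf '%s\n' a.rawacf.gz b.rawacf.gz c.rawacf.gz | sort > latrobe.list
printf '%s\n' a.rawacf.gz                         | sort > vt.list

# comm -23 prints lines unique to the first (sorted) file:
# these are the "new" files to download.
comm -23 latrobe.list vt.list > new_files.list
```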
Once a list of new filenames is generated, the sha1sum hash outputs are collected for these files and the data transfer begins with an scp command for each file. Once a file is downloaded, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file on the La Trobe server to make sure all of the data was downloaded. The file extension is then checked, as almost all files on the La Trobe archive are gzipped. Gzipped files are converted to bzip2 (if necessary) and then, lastly, the file is checked for compression integrity with a decompression test.
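The gzip-to-bzip2 conversion step could be sketched as follows; the filename is invented and the gzipped file is created locally as a stand-in for a downloaded one:

```shell
#!/bin/sh
# Hypothetical sketch of converting a gzipped La Trobe file to bzip2.
f="20140101.0000.00.tig.rawacf.gz"
printf 'rawacf payload\n' | gzip > "$f"    # stand-in for a downloaded file

case "$f" in
  *.gz)
    gunzip "$f"          # decompress; leaves 20140101.0000.00.tig.rawacf
    bzip2 "${f%.gz}"     # recompress as .bz2
    ;;
esac

# Final decompression test on the converted file
bzip2 -t "${f%.gz}.bz2"
```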
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing.
Dartmouth College radars (cve, cvw)
Due to security concerns at Dartmouth College, a general user account on a file server was troublesome to set up. Instead, a public posting of the files on a server at Dartmouth College was made available and the files are downloaded using a wget method.
Here, a list of files is generated by running wget on the index page where the files are posted. The filenames are filtered out so that a URL can be built for a subsequent wget command. Once a file is downloaded, error checking is done on the wget exit code and on the size of the downloaded file. Since shell access is not available here and sha1sum outputs are not provided, the contents of the file posted on the server and the contents of the downloaded file cannot be compared. Instead, the file is checked with a decompression test to ensure its integrity.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing.
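Building download URLs from the index page could be sketched as below. The index content, server URL, and filenames are all invented, and the actual wget fetches are shown only as comments:

```shell
#!/bin/sh
# Hypothetical sketch of building download URLs from a fetched index page.
# The real index would come from something like:
#   wget -q -O index.html http://server.example.edu/data/
cat > index.html <<'EOF'
<a href="20140101.0000.00.cve.rawacf.bz2">20140101.0000.00.cve.rawacf.bz2</a>
<a href="20140101.0200.00.cvw.rawacf.bz2">20140101.0200.00.cvw.rawacf.bz2</a>
EOF

BASE="http://server.example.edu/data"
# Pull the rawacf filenames out of the href attributes, then build URLs
grep -o 'href="[^"]*\.rawacf\.bz2"' index.html |
    sed -e 's/^href="//' -e 's/"$//' |
while read -r name; do
    echo "$BASE/$name"      # in the real script: wget "$BASE/$name"
done
```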
British Antarctic Survey radar (hal)
The British Antarctic Survey (BAS) posts data for their radar on a publicly available ftp server. As well, to check the integrity of the files, the sha1sum hash output is posted in a separate file for each file.
First, a list of posted files is downloaded, and the files are filtered by simply checking whether the filename already exists in the VT data archive. At this point the sha1sum hash outputs haven't been downloaded yet, so it isn't possible to check the contents of already-existing files. Once a list of files has been generated, the files are downloaded, again using the ftp method.
With the downloaded rawacf files and their matching sha1sum hash outputs, the files are first checked for file size. Then the hash of the downloaded file is checked against the matching hash that was generated at BAS. If the hashes match, then the file is put through a final decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Since Virginia Tech does not have control of the ftp server, the downloaded files are left staged until BAS moves them.
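Checking a download against a posted hash file fits `sha1sum -c` naturally. In this sketch the data file and its hash file are created locally as stand-ins (in reality the hash file would be the one posted by BAS), and the filenames are invented:

```shell
#!/bin/sh
# Hypothetical sketch of verifying a file against its posted sha1sum output.
printf 'rawacf payload\n' > f.rawacf
bzip2 -f f.rawacf                           # produces f.rawacf.bz2
sha1sum f.rawacf.bz2 > f.rawacf.bz2.sha1    # stand-in for the posted hash file

# Verify the download against the posted hash, then decompression-test it
sha1sum -c f.rawacf.bz2.sha1 && bzip2 -t f.rawacf.bz2 && echo "file ok"
```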
ISAP radar (dce)
The Institute for Space Astrophysics and Planetology (ISAP) stages up their rawacf files whenever the files are downloaded from their radar. Virginia Tech has an account on the server that the data is staged up on and is able to perform a number of shell commands. The primary transfer method is an scp command, much like what is done with the Virginia Tech radars.
After a list of files is found, the sha1sum hash output for each file is obtained. Much like before, each file is checked to see whether it already exists in our data archive, and if so, the hashes of the two files are compared. If the file doesn't exist in our data archive, we proceed to download it. Once the file is downloaded with an scp command, error checking is done on the scp exit code and on the size of the downloaded file. The hash of the newly downloaded file is compared against the hash of the file staged on the ISAP server to make sure all of the data was downloaded. Lastly, the file is checked for compression integrity with a decompression test.
If the downloaded file passes all of these tests, the file is staged up for USask to download. The file is then moved into our data archive for further processing. Here, ISAP prefers to control which files are available for downloading and when files are removed, so Virginia Tech does nothing to remove the downloaded file.
NIPR radars (sye, sys)
As of this writing, no transfer protocol has been established with the National Institute for Polar Research.
University of Leicester radars (han, pyk) & sto (CNRS/LCPE)
The University of Leicester posts data for their radars on a server on which Virginia Tech has an sftp account. As well, to check the integrity of the files, the sha1sum hash output is posted in a separate file for each data file.
First, the hash output files are downloaded to Virginia Tech. Then for each hash output file, the corresponding filename is checked to see if it already exists in the Virginia Tech archives. If the file does exist, the contents of the hash output file are examined to see if the two hashes match. If the filename does not exist, then the file is added to a download list. The rawacf files are then downloaded to Virginia Tech using the sftp command.
With the downloaded rawacf files and their matching sha1sum hash outputs, the files are first checked for file size. Then the hash of the downloaded file is checked against the matching hash that was generated at the University of Leicester. If the hashes match, then the file is put through a final decompression test.
If the downloaded file passes all of these tests the file is then staged up for the USask to download. The file is then moved into our data archive for further processing. Lastly, the rawacf filename as well as the corresponding hash output filename are added to a list of files to be removed from the University of Leicester staging area. Once all the files have been checked, a last sftp command is issued to remove all of the successfully downloaded files.
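The final batched removal could be sketched as below. The filenames, the remote path, the `.sha1` naming convention for the hash files, and the host are all assumptions for illustration; the actual sftp invocation is shown only as a comment:

```shell
#!/bin/sh
# Hypothetical sketch of batching the cleanup into a single sftp session.
: > remove.batch
for f in 20140101.0000.00.han.rawacf.bz2 20140101.0200.00.pyk.rawacf.bz2; do
    echo "rm outgoing/$f"       >> remove.batch   # the rawacf file
    echo "rm outgoing/$f.sha1"  >> remove.batch   # its hash output file
done
# One sftp command then removes all successfully downloaded files:
#   sftp -b remove.batch vt@sftp.example.ac.uk
```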
PRIC radar (zho)
Currently, the Polar Research Institute of China (PRIC) is the only SuperDARN institution that pushes data to Virginia Tech instead of the data being downloaded by Virginia Tech. Our colleagues there have noted that the data transfer is much quicker if they can upload the data to our server. Updates from the Zhongshan radar are only available once a year, since the rawacf files are transported back from the Antarctic on physical disk once a year.
PRIC has been given an account on a staging server and uploads the data there. Periodically, the uploaded data is checked for file size as well as compression integrity. If a file passes both of these tests, it is staged up for USask to download and moved into the Virginia Tech data archive.
The FileZilla program was used during the May 2014 upload period to upload the majority of the Dec. 2012 to Dec. 2013 data.