This document describes the Sussie application (1.7). For the project page, go here.

2.3. Hashing

This tool has two main functions:

  • It creates a hash for each file with one of the selected extensions.

  • It checks that those hashes are still valid after the dataset folder has been moved.

The goal is to verify that no file corruption occurred while the survey data files were being handled (e.g., copied or moved).
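The two functions can be illustrated with a minimal Python sketch built on the standard `hashlib` module. This is not the Sussie implementation: the names `hash_file`, `create_hashes`, and `verify_hashes` are hypothetical, and the sketch assumes that hashes are stored against paths relative to the dataset folder, so that they remain usable after the folder is moved.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of one file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        # Chunked reads keep memory use low for large survey files.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_hashes(folder, extensions, algorithm="sha1"):
    """Map each matching file (path relative to folder) to its digest."""
    folder = Path(folder)
    wanted = {ext.lower() for ext in extensions}
    return {
        path.relative_to(folder).as_posix(): hash_file(path, algorithm)
        for path in sorted(folder.rglob("*"))
        if path.is_file() and path.suffix.lower() in wanted
    }


def verify_hashes(folder, hashes, algorithm="sha1"):
    """Return the relative paths that are missing or whose digest changed."""
    folder = Path(folder)
    failed = []
    for relative_path, expected in hashes.items():
        path = folder / relative_path
        if not path.is_file() or hash_file(path, algorithm) != expected:
            failed.append(relative_path)
    return failed
```

In this sketch, `create_hashes` would be run on the original dataset folder and `verify_hashes` on the moved copy; an empty list means every hashed file is present and unchanged.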

Fig. 2.3 The Hashing tab.

2.3.1. Which hashing algorithm should we use?

For the vast majority of our needs, the simplest and quickest algorithm, SHA1, should be sufficient.

The following table gives indicative computation times, measured by hashing 100 FAU files (about 20 GB in total):

Hashing Algorithm    Computation Time
-----------------    ----------------
SHA1                 1.5 min
SHA224               1.9 min
SHA256               2.8 min
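To get comparable figures on a different machine, a rough timing check along the following lines may help. It is only a sketch: `time_algorithms` is a hypothetical helper, it hashes an in-memory buffer rather than real FAU files, and it therefore ignores disk read time, which can dominate on large datasets.

```python
import hashlib
import time


def time_algorithms(data, algorithms=("sha1", "sha224", "sha256"), repeats=3):
    """Return the best-of-`repeats` hashing time in seconds per algorithm."""
    timings = {}
    for name in algorithms:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            hashlib.new(name, data).hexdigest()
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


if __name__ == "__main__":
    # 16 MiB of zeros, used here as a stand-in for survey data.
    sample = bytes(16 * 1024 * 1024)
    for name, seconds in time_algorithms(sample).items():
        print(f"{name}: {seconds:.3f} s")
```

Absolute times depend on the hardware and on how Python's `hashlib` was built, so only the relative ordering is likely to be informative.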

2.3.2. Which file extensions should we hash?

At a minimum, we should hash the raw FAU files.

To hash all the files, enter “.*” in the “File Extensions” field; note that the computation time will increase.
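The effect of the extension filter can be pictured with the short sketch below. The function `is_selected` is hypothetical, and the matching rules are assumptions rather than documented Sussie behavior: “.*” is treated as a wildcard that selects every file, and other entries are compared case-insensitively against the file's final suffix.

```python
from pathlib import Path


def is_selected(filename, extensions):
    """Return True if `filename` matches one of the requested extensions.

    Assumption: ".*" selects every file; other entries are compared
    case-insensitively against the file's final suffix.
    """
    wanted = {ext.strip().lower() for ext in extensions}
    if ".*" in wanted:
        return True
    return Path(filename).suffix.lower() in wanted
```

With these assumptions, `is_selected("line_001.fau", [".fau"])` is true, `is_selected("notes.txt", [".fau"])` is false, and any file name is selected when the list contains “.*”.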