Please add a new dataset! Don’t replace it.
comment withdrawn
The current naming convention and directory structure do not readily support multiple versions of the datasets. I would like to support multiple versions of the datasets, in keeping with FAIR principles. But of course it can very quickly become a complicated challenge.
I’m thinking perhaps we could create a subdirectory to hold the versioned instances of the datasets but have the unversioned path/filename (e.g. server/earth/earth_faa/earth_faa_01m_p) point to the latest, most up-to-date version.
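To make the idea concrete, here is a minimal sketch of that layout using a symbolic link. It is not how the GMT data server works today: the `versions/` subdirectory name, the `publish_version` helper, and the date stamps are all hypothetical, and the file written is only a placeholder. It also assumes a filesystem that supports symlinks (POSIX; on Windows this needs extra privileges).

```python
import os
from pathlib import Path


def publish_version(dataset_dir: Path, version: str, filename: str) -> Path:
    """Store a file under versions/<version>/ and repoint the unversioned name at it.

    Hypothetical layout: 'versions' and the helper name are illustrative only.
    """
    versioned = dataset_dir / "versions" / version / filename
    versioned.parent.mkdir(parents=True, exist_ok=True)
    # Placeholder content; a real server would copy the grid or tiles here.
    versioned.write_text(f"placeholder for {filename} ({version})\n")

    link = dataset_dir / filename
    tmp_link = dataset_dir / (filename + ".tmp")
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    # Use a relative target so the tree can be moved or mirrored as a whole.
    os.symlink(os.path.join("versions", version, filename), tmp_link)
    # Renaming the temporary link over the old one swaps the pointer in one step.
    os.replace(tmp_link, link)
    return link


if __name__ == "__main__":
    import tempfile

    root = Path(tempfile.mkdtemp())
    faa = root / "server" / "earth" / "earth_faa"
    faa.mkdir(parents=True)
    publish_version(faa, "20250101", "earth_faa_01m_p")
    latest = publish_version(faa, "20260120", "earth_faa_01m_p")
    print(latest, "->", os.readlink(latest))
```

With this layout the unversioned path keeps working for existing scripts, while an older release stays reachable under its own `versions/<date>/` path.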
How has this issue been addressed within GMT in the past? Is there already a paradigm I should be aware of? I did a quick search of the forum, but didn’t come up with any guidance.
comment withdrawn
Hi Solar
Yes, the current policy is published here (Remote Datasets — GMT 6.7.0 documentation):
It is our policy to only supply the latest version of any dataset that undergoes revisions. If you require previous versions for your work you will need to get those data from the data provider separately.
Extending GMT to version the datasets looks complicated (and honestly, in my view, data distribution is not exactly GMT's main business).
Our current naming scheme already has 3 parameters
```@remote_name_rru[_reg]```
and adding a new one would imply a 4th one
??
```@remote_name_[date]_rru[_reg]```
but then `date` here would be dependent on that dataset, as each of them has a different release history. And all this would need to be followed by code changes to accommodate it … far from trivial, and very doubtful that it would be worth the time/effort (whose?) to implement it.
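To illustrate what the extra field means for parsing, here is a rough sketch in Python of a parser for the naming patterns above. It is not GMT's actual implementation. It assumes `rr` is two digits, `u` is one of `d`, `m`, `s`, `reg` is `g` or `p`, and that the hypothetical `date` field would be an eight-digit `YYYYMMDD` stamp; the `RemoteName` type and `parse_remote_name` function are made-up names.

```python
import re
from typing import NamedTuple, Optional


class RemoteName(NamedTuple):
    name: str
    date: Optional[str]          # hypothetical 4th parameter, YYYYMMDD
    resolution: int
    unit: str                    # d, m or s
    registration: Optional[str]  # g or p, if given


# @remote_name[_date]_rru[_reg]; the date group is the assumed extension.
_PATTERN = re.compile(
    r"@?(?P<name>[a-z0-9_]+?)"
    r"(?:_(?P<date>\d{8}))?"
    r"_(?P<rr>\d{2})(?P<u>[dms])"
    r"(?:_(?P<reg>[gp]))?"
)


def parse_remote_name(text: str) -> RemoteName:
    """Split a remote dataset name into its parts, or raise ValueError."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised remote dataset name: {text!r}")
    return RemoteName(
        match["name"], match["date"], int(match["rr"]), match["u"], match["reg"]
    )


if __name__ == "__main__":
    print(parse_remote_name("@earth_faa_01m_p"))
    print(parse_remote_name("@earth_faa_20260120_01m_p"))
```

Even in this toy form, the parser cannot tell which dates are valid for which dataset; that would need a per-dataset release table, which is the part that makes the real change far from trivial.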
BTW, why is it important to keep each dataset in both gridline and pixel registrations? I presume there must be some good reason to keep double the size of each GMT dataset?
One option that might not be too complicated is to create a tarball of the dataset, like server-earth-earth_faa-20260120.tar.gz and store it in something like an Archive subfolder. So if a user really wanted the older version to support reproducible research, they could download it in a form that is already tiled and formatted to be readily useable by GMT. It would then be up to the user to unpack the datasets into the appropriate place in their .gmt/server directory and do something like set GMT_DATA_SERVER = none and GMT_DATA_UPDATE_INTERVAL = 0 to prevent the dataset cache from being overwritten when GMT runs.
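The archive step could look something like the sketch below, written in Python with the standard `tarfile` module. The helper names `archive_dataset` and `restore_dataset` and the `Archive` folder are illustrative, not an existing GMT tool; the file name follows the `server-earth-earth_faa-20260120.tar.gz` pattern suggested above. The GMT settings are left out here, since they would be applied separately by the user.

```python
import tarfile
from pathlib import Path


def archive_dataset(server_root: Path, dataset_rel: str, stamp: str,
                    archive_dir: Path) -> Path:
    """Pack server_root/dataset_rel into archive_dir as a dated .tar.gz.

    dataset_rel uses forward slashes, e.g. "earth/earth_faa".
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    name = "server-" + dataset_rel.replace("/", "-") + f"-{stamp}.tar.gz"
    out = archive_dir / name
    with tarfile.open(out, "w:gz") as tar:
        # Keep paths relative to the server root so the tarball unpacks in place.
        tar.add(server_root / dataset_rel, arcname=dataset_rel)
    return out


def restore_dataset(archive: Path, server_root: Path) -> None:
    """Unpack an archived dataset under a local server directory.

    Only unpack archives from a source you trust.
    """
    server_root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(server_root)
```

A user would call `restore_dataset` with their own `.gmt/server` directory as `server_root`, then adjust the GMT settings mentioned above so the restored files are not refreshed.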
I just realized I have already been storing some old and not so old versions of some datasets archived locally.
So I withdraw all my arguments for the GMT project to keep providing access to the old versions of the remote datasets.