Filter1d and sample1d with no-data columns

Malte.Thoma · May 25, 2022, 8:30am

Again a question with respect to my transition from GMT4 to GMT6:
My input data looks like this (with many more lines):

2022-04-25T02:27:02 : 48.62 46.77
2022-04-27T04:27:01 : 89.23 88.81
2022-04-29T06:27:01 : 91.69 90.44
2022-05-01T08:27:01 : 96.31 95.52
2022-05-03T10:27:01 : 95.69 93.32
2022-05-05T12:27:02 : 94.15 90.71
2022-05-07T14:54:02 : 88.31 85.71
2022-05-09T16:54:02 : 85.23 82.53

cat | sample1d -fT -I1 --TIME_UNIT=d
results in
sample1d [ERROR]: Input data have 1 column(s) but at least 2 are needed

What helps would be this WORKAROUND:
cat | sample1d -fT -I1 --TIME_UNIT=d
and (after quite some time) this SOLUTION:
cat | sample1d -fT -I1 --TIME_UNIT=d -i0:3

I do acknowledge that there are changes in GMT6, but handling the second column with “:” only if the -i option is provided? I do not understand the reason for this! Shouldn’t
-i0:NF
be the default? Why do I have to provide it at all?
Perhaps you can explain to me why you chose this more complex approach, so that I might understand GMT6 better.

Kind regards,
Malte

In the manpage I found the “-d” option, which sounds as if it should do the job, but I can’t get it working:

Joaquim · May 25, 2022, 11:04am

So you have a most unusual data format and complain that a convoluted procedure is needed to read it.

What should be taken as the column separator in your file: spaces, the colon, or the " : " word?
Or in other words, what is that stray colon after the datetime doing there?

Malte.Thoma · May 25, 2022, 11:22am

I just would like to know why
-i0:3
is needed in this case? It should be the default, shouldn’t it?
default= from the first to the last column without “-i”

Joaquim · May 25, 2022, 11:40am

Programs use certain characters to detect where one value ends and the next one starts. The well-known designation CSV means “comma separated values”: values are separated by commas. In practice it has become a generic name that also covers values separated by spaces (in fact spaces or tabs) or by any other character, but in the latter case the program must be told what the separating character is. Your file uses BOTH spaces and a “:”, so the question should rather be: how the (character) is GMT still able to read this file?
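
To make the separator question concrete, here is a minimal sketch that splits the first record from the original post on whitespace. It uses awk, not GMT, and the helper name `show_fields` is made up for this illustration.

```shell
#!/bin/sh
# Illustration only: split one record from the first post on whitespace.
# show_fields is a hypothetical helper, not part of GMT.
show_fields() {
    awk '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'
}

record='2022-04-25T02:27:02 : 48.62 46.77'
printf '%s\n' "$record" | show_fields
```

A whitespace-splitting reader sees four fields here, and the lone ":" is a field of its own in second position, between the time stamp and the two numbers.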

Malte.Thoma · May 25, 2022, 12:19pm

How the (character) is GMT still able to read this file
as in GMT4
First column -> Time
All other columns will be filtered/sampled
When a column contains anything that is not a number (like “:”), create NaN.

Quite simple: the result with “-i0:3” should be the very same as without it (as in GMT4).

Joaquim · May 25, 2022, 12:53pm

And it is … when files are correctly created and do not rely on previously buggy behavior that happened to satisfy your needs. Text strings should NOT be converted into NaNs but remain what they are: text. GMT6 keeps everything after the first text column as text, even if it consists of numbers. So I’m surprised that this “:” escaped that, and it looks like a bug to me that we can read anything beyond the first DateTime column as anything but text.

pwessel · May 26, 2022, 11:51pm

Back in GMT 5 we decided that, to solve a bunch of problems related to mixed text and data and to modules that need to add data to records, we would switch to a straightforward ASCII data format in GMT. It expects any number of numerical columns followed by optional trailing text. Modules that add values to existing records (e.g., grdtrack, mapproject, etc.) will add to the numerical array instead of after the text, since appending after the text would make the format unpredictable.
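
The record layout described here (leading numerical columns, then optional trailing text) can be sketched with a small awk model. This is a simplified illustration and not GMT's actual parser: the regular expressions used to recognise numbers and date-time stamps, and the helper name `split_record`, are assumptions.

```shell
#!/bin/sh
# Simplified model of the record layout: leading numerical columns,
# then optional trailing text. NOT GMT's parser; the patterns and the
# helper name split_record are assumptions made for illustration.
split_record() {
    awk '{
        n = 0
        for (i = 1; i <= NF; i++) {
            # Count plain numbers and ISO-like date-time stamps as data;
            # stop at the first field that is neither.
            if ($i ~ /^[-+]?[0-9]+(\.[0-9]*)?$/ || $i ~ /^[0-9]+-[0-9]+-[0-9]+T[0-9:.]+$/)
                n++
            else
                break
        }
        # Everything from the first non-numerical field on is trailing text.
        text = ""
        for (j = n + 1; j <= NF; j++) text = text (j > n + 1 ? " " : "") $j
        printf "data columns: %d | trailing text: \"%s\"\n", n, text
    }'
}

printf '%s\n' '2022-04-25T02:27:02 : 48.62 46.77' | split_record
printf '%s\n' '2022-04-25T02:27:02 48.62 46.77'   | split_record
```

Under this model the record with the stray ":" yields one data column and the trailing text ": 48.62 46.77", which is consistent with the "1 column(s)" error quoted in the first post. The same record without the colon yields three data columns and no trailing text.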

It is true that you can force GMT to read the first 4 columns as data via -i0:3, so if you have lots of files like that and standard workflows that you want to keep as in GMT 4, the simplest solution is to add -i0:3 or similar to those scripts.
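
Where adding -i0:3 to every script is not wanted, another route is to drop the stray ":" field before the data reach GMT. This is not something suggested in this thread, so treat it as an assumption. A minimal sketch with awk, where the helper name `drop_colon` is made up:

```shell
#!/bin/sh
# Hypothetical pre-filter: remove stand-alone ":" fields so that only the
# time stamp and the numerical columns remain in each record.
drop_colon() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != ":") out = out (out == "" ? "" : " ") $i
        print out
    }'
}

printf '%s\n' '2022-04-25T02:27:02 : 48.62 46.77' \
              '2022-04-27T04:27:01 : 89.23 88.81' | drop_colon
```

The filtered stream could then be piped into sample1d with the options from the first post; that step is not run here.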

The record format and how -i and -o work are explained in the docs. Going forward, it will be simpler in the long run to study this a bit and stick with such arrangements. GMT even makes it possible to extract specific words from the trailing text.

Malte.Thoma · May 27, 2022, 5:39am

Dear Paul,
many thanks for the historical background. It helps me to understand the apparent regression.
Kind regards,
Malte