Blockmedian through pygmt.clib.Session()

Sistjerne · March 17, 2022, 10:50am

Hi all,

I am trying to calculate the percentiles P25 and P75 through blockmedian. I may not understand the session call. The code below gives an error:

pygmt.config(FORMAT_FLOAT_OUT="%.5f")
with pygmt.clib.Session() as session:
    with session.virtualfile_from_data(
        check_kind="vector", data=df  ## df is a pandas dataframe of lon, lat, value
                 ) as fin:
         with GMTTempFile() as tmpfile:
             args_str = " ".join([f"-I50k -R-25/20/76/82 -Eb", "->", tmpfile.name])
             session.call_module("blockmedian", fin + args_str)
             df_total = tmpfile.read().strip()

Error message:

pygmt.exceptions.GMTCLibError: Module 'blockmedian' failed with status code 72:
blockmedian [ERROR]: Libcurl Error: HTTP response code said error
blockmedian [ERROR]: Option -I: Must specify positive increment(s)

I have with success run the following command-line:
gmt blockmedian test.txt -I50k -Eb -R25/20/76/82 -V > out

I have tried to run the more simple function (mapproject) from Stdout of modules (e.g. mapproject) using pygmt.clib.Session.call_module. This worked too.

Any idea of what I am doing wrong?

maxrjones · March 17, 2022, 2:07pm

Why do you want to use pygmt.clib.Session rather than using pygmt.blockmedian (https://www.pygmt.org/latest/api/generated/pygmt.blockmedian.html#pygmt.blockmedian) which handles all the setup for you?

Sistjerne · March 17, 2022, 2:27pm

I do not think the summary argument is implemented for blockmedian. It is for blockmean, but I do not see it for blockmedian. Am I wrong?

maxrjones · March 17, 2022, 4:02pm

If you do not mind updating the parameter name after summary is added to blockmedian (since single-character parameters will eventually be disallowed; https://github.com/GenericMappingTools/pygmt/issues/262), you could pass E="b" to pygmt.blockmedian (e.g., pygmt.blockmedian(data=df, spacing="50k", region="-25/20/76/82", E="b").

If you want to continue with the session route, I recommend adding the virtual file name to the args_str assignment rather than joining it with args_str when calling session.call_module, then adding a breakpoint to inspect the string. You can allso add -Vd to your args_str to receive more debugging information from GMT. It looks like there is an issue with the virtual file name, in that it is being interpreted as a remote file.

Sistjerne · March 18, 2022, 8:59am

Great it works perfectly with:

pygmt.blockmedian (e.g., pygmt.blockmedian(data=df, spacing="50k", region="-25/20/76/82", E="b")

Thanks!

Sistjerne · March 18, 2022, 9:49am

I will just note, that there is some confusion with the output column names in the resulting Pandas DataFrame.

df.head():

              x                      y                      z 
index                                                 
0            45.795651      80.898771        -0.054437
1            45.791624      80.901436         0.047124
2            45.787594      80.904101        -0.042853
3            45.783563      80.906766         0.022403
4            45.779529      80.909430         0.020006

Becomes:

                                               x              y                z
-9.72413  82.15526 -0.03236 -0.09259        -0.08257       -0.00794          0.11400
-8.73843  81.97523 -0.07678 -0.15819        -0.09088       -0.05523          0.02935
-5.89476  82.03664 -0.07459 -0.13124        -0.11328       -0.04761         -0.04350

Where the first four columns are indices. The actual output is

   x         y         z      z_min            P25           P75              z_max
-9.72413  82.15526 -0.03236 -0.09259        -0.08257       -0.00794          0.11400
-8.73843  81.97523 -0.07678 -0.15819        -0.09088       -0.05523          0.02935
-5.89476  82.03664 -0.07459 -0.13124        -0.11328       -0.04761         -0.04350

weiji14 · March 18, 2022, 3:29pm

Yeah, we’ll need to support E="b" (extended report) properly in blockmedian, since right now it’s hardcoded to use the columns from the input dataframe (see https://github.com/GenericMappingTools/pygmt/blob/60eb50592a564723f98c19bd452c2debcd6098f7/pygmt/src/blockm.py#L59-L61).

Could you open a general feature request at https://github.com/GenericMappingTools/pygmt/issues/new?assignees=&labels=feature+request&template=feature_request.md&title= for adding E="b" to blockmedian? Even better if you have time to implement it directly in PyGMT