Exponential slowdown when using gmt with Python subprocess and multiprocessing

j762 · December 17, 2021, 3:57pm

I’m having a weird issue.

Rocky Linux 8.5
Python 3.8.3
gmt 5.3.0_r16358

Essentially, we generate a presentation, which is a collection of EPS files. These are all generated by gmt, in parallel. In a following step, these EPS files are combined into a PDF, but that’s not what’s giving me trouble.

The plan of attack is that a Python script uses the multiprocessing module, and Pool.imap(), to parallelize the EPS generation. Each worker process then calls a gmt command via the subprocess module. The gmt commands are of the form:

gmt options >> tmp.eps && gmt options >> tmp.eps

and so on; in practice a couple of dozen individual commands. Sometimes there’s also a pipe involved.
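
For anyone who wants to reproduce the timings, the setup described above could look roughly like the sketch below. This is not the actual script: the names run_job and run_all are made up for illustration, and the command lines are harmless echo placeholders standing in for the real chains of gmt calls.

```python
import subprocess
import time
from multiprocessing import Pool


def run_job(command):
    """Run one shell command line; return (exit code, elapsed seconds)."""
    start = time.perf_counter()
    # shell=True so that '&&', '>>' and pipes in the command line work.
    result = subprocess.run(command, shell=True)
    return result.returncode, time.perf_counter() - start


def run_all(commands, workers):
    """Run the command lines in a pool of worker processes, timing each."""
    with Pool(processes=workers) as pool:
        # imap yields results in the same order as the input commands.
        return list(pool.imap(run_job, commands))


if __name__ == "__main__":
    # Placeholder command lines (assumption); the real ones are gmt chains.
    commands = ["echo page %d > /dev/null" % i for i in range(8)]
    for returncode, seconds in run_all(commands, workers=4):
        print("exit %d after %.2f s" % (returncode, seconds))
```

Running this once with workers=1 and once with a larger pool, with the real command lines substituted, gives per-file times that can be compared between the serial and parallel cases.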

What I observe is that if I run with one worker process, so effectively in serial, each EPS file is generated in under two seconds, with not much variation. But if I parallelize the operation, the completion times increase exponentially, starting out at around the same time as in the serial case but then increasing to over two minutes!

Is there a known bottleneck when running many concurrent gmt processes?

j762 · December 20, 2021, 3:14pm

I got a little further.

It’s definitely linked to having the working directory on an older NFS share.
It’s likely to be linked to repeated writing of gmt.conf and/or gmt.history.

j762 · December 20, 2021, 7:09pm

Figured it out. It was a combination of many calls to gmt (over 300) in a very short timespan, with all of the processes using the same temporary directory. This means that they were all fighting to write to gmt.history. This can be fixed with:

gmtset GMT_HISTORY false

The second part of the problem was that this temporary directory is an NFS share, provided by an older version of NFS, and accessed by a newer NFS client, which causes these problems.
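
A different way to avoid both halves of the problem, which is not what was done here and is only a sketch, would be to run each job in its own scratch directory so that no two gmt processes share a gmt.history or gmt.conf. The helper name run_isolated is hypothetical, and the sketch assumes the system temp location used by tempfile is on local disk rather than on the NFS share.

```python
import os
import shutil
import subprocess
import tempfile


def run_isolated(command, output_name, dest_dir):
    """Run a shell command line in a private scratch directory,
    then copy the file it produced (output_name) into dest_dir."""
    # Assumption: the default temp location (e.g. /tmp) is local disk.
    with tempfile.TemporaryDirectory(prefix="gmtjob_") as scratch:
        # cwd=scratch: relative outputs and any per-directory state files
        # written by the command end up in this job's own directory.
        subprocess.run(command, shell=True, cwd=scratch, check=True)
        dest = os.path.join(dest_dir, output_name)
        shutil.copy(os.path.join(scratch, output_name), dest)
    # The scratch directory is removed when the with-block exits.
    return dest
```

A worker would call something like run_isolated("gmt options >> tmp.eps && gmt options >> tmp.eps", "tmp.eps", dest_dir), with dest_dir being wherever the EPS files are collected. Since the command no longer runs in the original working directory, any input files would have to be given as absolute paths.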