I’m having a weird issue.
- Rocky Linux 8.5
- Python 3.8.3
- gmt 5.3.0_r16358
Essentially, we generate a presentation, which is a collection of EPS files. These are all generated by gmt, in parallel. In a following step, these EPS files are combined into a PDF, but that’s not what’s giving me trouble.
The plan of attack is that a Python script uses the multiprocessing module, and Pool.imap(), to parallelize the EPS generation. Each worker process then runs a gmt command via the subprocess module. The gmt commands are of the form:
gmt options >> tmp.eps && gmt options >> tmp.eps
and so on; in practice a couple of dozen individual commands are chained this way. Sometimes there’s also a pipe involved.
What I observe is that if I run with one worker process, so effectively in serial, each EPS file is generated in under two seconds, with not much variation. But if I parallelize the operation, the completion times increase exponentially: they start out at around the same time as in the serial case, but then climb to over two minutes!
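To make the setup concrete, here is a minimal sketch of the pattern described above, with per-job timing. It is not the actual script: the names run_job and run_all are made up for illustration, the echo commands are placeholders standing in for the real gmt chains, and running every job in its own scratch directory is an assumption made here so that the relative name tmp.eps never points at the same file in two workers.

```python
import os
import subprocess
import tempfile
import time
from multiprocessing import Pool


def run_job(command):
    """Run one shell command chain and return (returncode, seconds, eps_bytes)."""
    # Assumption: a private scratch directory per job, so tmp.eps is not shared.
    with tempfile.TemporaryDirectory() as workdir:
        start = time.perf_counter()
        # shell=True because the chain relies on '>>', '&&' and sometimes '|'.
        result = subprocess.run(command, shell=True, cwd=workdir)
        elapsed = time.perf_counter() - start
        eps_path = os.path.join(workdir, "tmp.eps")
        # Read the size before the scratch directory is cleaned up.
        size = os.path.getsize(eps_path) if os.path.exists(eps_path) else 0
    return result.returncode, elapsed, size


def run_all(commands, workers):
    """Run all command chains through a process pool, keeping input order."""
    with Pool(processes=workers) as pool:
        return list(pool.imap(run_job, commands))


if __name__ == "__main__":
    # Placeholder jobs; the real ones are chains of a couple of dozen gmt calls.
    jobs = [
        "echo page {0} >> tmp.eps && echo done >> tmp.eps".format(i)
        for i in range(8)
    ]
    for workers in (1, 4):
        print("workers:", workers)
        for returncode, elapsed, size in run_all(jobs, workers):
            print("  rc={0} time={1:.3f}s bytes={2}".format(returncode, elapsed, size))
```

Comparing the per-job times printed for one worker against several workers is the measurement referred to above: roughly constant in the serial case, growing from job to job in the parallel case.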
Is there a known bottleneck when running many concurrent gmt processes?