Parallelization using PYGMT

I have been trying to parallelize portions of my code because the “reconstructed_CEED_land_simple.gmt” is incredibly large and it takes a very long time to plot. So I was thinking of parallelizing the plotting_coasts_and_land_subplots function. However, there is this error:

psconvert [ERROR]: No hidden PS file /.gmt/sessions/gmt_session.4308/gmt_1.ps- found
Module 'psconvert' failed with status code 78:
psconvert [ERROR]: No hidden PS file /.gmt/sessions/gmt_session.4308/gmt_1.ps- found

If I modify the program to run linearly, it does not have this problem at all. Here is the parallelized code. I was getting similar problems to this post: https://github.com/GenericMappingTools/pygmt/issues/217
where the PID is not correct, but the machine I am running on is in fact Linux.

import multiprocessing as mp
from importlib import reload

import pygmt

fig = pygmt.Figure()

def plotPanela():
    # Set a base for the panel a map; the projection is azimuthal orthographic: Glon0/lat0[/horizon]/width
    fig.basemap(region="d", projection="G120/0/60/6c", frame=["a30g30", '+t"a)"'])
    fig.coast(land="skyblue", water="skyblue")  # plot a blue background

def plotPanelb():
    # Set a base for the panel b map; the projection is Mollweide: W[lon0/]width
    fig.basemap(region="d", projection="W120/12c", frame=["a30g30", '+t"b)"'])
    fig.coast(land="skyblue", water="skyblue")  # plot a blue background

def plotting_coasts_and_land_subplots(index):
    reload(pygmt)

    if index == 0:  # plotting panel a
        plotPanela()
    else:
        # Plot panel b, first shifting the plot origin by "width of the first map + 1 cm"
        fig.shift_origin(xshift="w+1c")
        plotPanelb()

    # plot flooded land (plate) polygons supplied by the model author
    fig.plot(data="reconstructed_CEED_land_simple.gmt", G="skyblue2", pen="0.1p,black")
    # plot dry land (plate) polygons supplied by the model author
    fig.plot(data="reconstructed_CEED_Exposed_Land.gmt", G="bisque1@20", pen="0.1p,black")

def main():
    pool = mp.Pool(mp.cpu_count())
    # note: args must be a tuple, i.e. (index,) rather than (index)
    result = [pool.apply(plotting_coasts_and_land_subplots, args=(index,)) for index in range(2)]

    fig.savefig("/final_image.png", dpi=150)

main()

How would the parallelization manage to append the PostScript from the various sections, in the right order, to the same PS file being built under the hood? I don't think that is possible.

The solution is different. If your data "is incredibly large and it takes a very long time to plot", just don't plot all of it. It can't be seen anyway. I mean, plot only a subset (a decimation) of it.

My team, the Paleogeography Working Group of the Deep-Time Digital Earth (DDE) under UNESCO, was wondering if we could demonstrate our idea to a core engineer and show what we would like to accomplish using PyGMT. If the parallelization worked, it could really speed up the plotting process for our website. Here is our website: http://dev.geolex.org/index.php

We cannot easily parallelize the plot module. Since it needs to add the PostScript to a single file in the right order, a scheme to parallelize via OpenMP would be to parallelize the loop over segments, write those to temporary files, and, when the OpenMP loop completes, append those pieces in the right order to the single PostScript result. There are no plans to do this at the moment.
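That write-then-append-in-order scheme can be sketched in plain Python (using multiprocessing in place of OpenMP, and short text strings as stand-ins for the PostScript fragments; all names here are hypothetical, not part of GMT or PyGMT):

```python
import multiprocessing as mp
import os
import tempfile

def render_segment(task):
    """Stand-in for rendering one segment; writes its piece to a temp file."""
    index, segment = task
    path = os.path.join(tempfile.gettempdir(), f"piece_{index}.txt")
    with open(path, "w") as f:
        f.write(f"%% segment {index}: {segment}\n")
    return index, path

def render_parallel(segments):
    # Render all segments in parallel; each worker writes its own temp file,
    # so no two processes ever touch the same output stream.
    with mp.Pool() as pool:
        pieces = pool.map(render_segment, list(enumerate(segments)))
    # Append the pieces in index order, so the final stream is deterministic
    # regardless of which worker finished first.
    pieces.sort(key=lambda t: t[0])
    out = []
    for _, path in pieces:
        with open(path) as f:
            out.append(f.read())
        os.remove(path)
    return "".join(out)
```

The key point is that the ordering is restored only after the parallel loop completes; the workers themselves never append to the shared result.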

The only workaround I can imagine is to plot the segments individually, make lots of PNG plots, and then stack those PNGs. I'm not sure whether you can stack PDF plots, for instance. It all seems like a lot of work.

Let me ask again. Why do you want to parallelize plot? Because it takes too long? And why so? Because you are plotting very big files?

Very big files have data of which perhaps 99% (or more) is not visible because, unless you are plotting on a big wall, those points/lines will simply overlap. The solution is conceptually very simple (though in practice maybe not that much): just drastically decimate your data.

See gmt simplify for that task.
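For context, gmt simplify implements the Douglas-Peucker line-reduction algorithm. Here is a minimal pure-Python sketch of the idea (not GMT's implementation, and ignoring the spherical geometry that GMT handles); function names are my own:

```python
import math

def perpendicular_distance(pt, start, end):
    """Distance from pt to the infinite line through start and end."""
    if start == end:
        return math.dist(pt, start)
    (x0, y0), (x1, y1), (x, y) = start, end, pt
    num = abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0)
    den = math.hypot(y1 - y0, x1 - x0)
    return num / den

def douglas_peucker(points, tolerance):
    """Drop vertices that deviate less than `tolerance` from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the endpoints.
    dmax, imax = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, imax = d, i
    if dmax <= tolerance:
        # Everything in between is negligible: keep only the endpoints.
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and recurse on both halves.
    left = douglas_peucker(points[: imax + 1], tolerance)
    right = douglas_peucker(points[imax:], tolerance)
    return left[:-1] + right

# A nearly flat line collapses to its endpoints at a coarse tolerance:
line = [(0, 0), (1, 0.05), (2, 0), (3, 0.05), (4, 0)]
print(douglas_peucker(line, 0.1))  # -> [(0, 0), (4, 0)]
```

The tolerance plays the same role as the `-T` argument of gmt simplify: the larger it is, the fewer vertices survive.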

Could you provide a sense of how many polygons you are trying to plot? And perhaps how many nodes/vertices each polygon has on average? My gut feeling is that you might want to use a tool like datashader (see https://datashader.org/user_guide/Polygons.html) to rasterize the polygons into a grid, and then overlay that grid on a GMT basemap. See e.g. sample code at https://github.com/weiji14/deepicedrain/blob/v0.4.2/atl11_play.py#L317 (sorry that it’s not well written).