Number of Datapoints / Area

Hi All,

I have generated a global dataset of mountain peaks (arbitrarily located (x,y,z) triples) in a geographic coordinate systems (WGS84 - long/ lat). Now I try generating a global grid by computing the number of peaks / area. It seems that this should be a simple task for GMT but I cannot find a working approach (sorry for the trivial question). I would require counting the number of peaks within a moving circular window with a given radius (in km).

Any ideas how to perform this task?

Cheers Jörg

There is no specific module that does exactly what you want. You can do it with a square box in lon/lat via blockmean or xyz2grd. Or, you can write a script that uses gmtselect to find the number of your xyz triplets taht are inside the circle centered on each desired grid node in a loop. Should not be too hard: Pick your desired global grid parameters (increment in lon,lat), use grdmath to make an empty grid, dump it to ascii, then loop over the records and make a single-record point file that you use in gmt select -C to find the points inside that radius and then add lon lat count to a growing output table that you then xyz2grd make into a grid in the end.

Feel free to comment on my feature request at https://github.com/GenericMappingTools/gmt/issues/3942 if I missed a functionality.

Dear Paul,

Thank you for showing a working solution for this problem. I will give it a try and write a little script following your recipe.

The feature request perfectly describes the missing functionality. I was thinking that nearneighbor could do the job but there is no option for writing the sum the of data points within the search radius and as far as I understand it just takes the nearest neighbor of each sector and not all data points within the search radius. Anyhow, gmtcount sounds great!

Cheers
Jörg

Hi Paul, colleagues,

I have translated Paul’s recipe into a small bash script. As expected the bash script has a bad performance and is not suitable for a large scale / high-resolution analysis. Anyhow, maybe this workaround is useful for somebody else. I think slicing large domains into smaller tiles by a concurrent reduction of data points should increase the speed and may allow a simple form of parallelization. Any further ideas to improve the performance?

Here is the bash script

#Recipe of Paul
#-----------------------------------------------------------------------------------------------------
#Decide on your grid specifications and make a dummy empty grid with grdmath
#Dump the grid with grd2xyz to ascii
#Write a script that loops over the records in this list of nodes
#For each record, dump it to a point.txt file
#Run gmtselect -Cpoint.txt +dradius on the whole dataset, then count the number of records found
#Add the node location and the count to a growing output table
#Run xyz2grd on that table to make the final grid

function NumberOfDataPoints()
{

grdmath $reg $res -r 0 = dummy.grd
grd2xyz dummy.grd | awk ‘{print $1,$2}’ > Nodes

echo lon lat NoP > Density.txt
while read line; do
echo “$line” > point.txt
NumberOfPeaks=$(gmtselect peaks -Cpoint.txt+d"$radius"km -fg | wc -l)
echo “$line” $NumberOfPeaks >> Density.txt
echo “$line” $NumberOfPeaks
done < Nodes

xyz2grd Density.txt $reg $res -r -h1 -GDensity.grd

#clean up
rm dummy.grd Nodes point.txt
}

#Define Region of interest
W=0
E=20
S=40
N=50

reg=-R"$W"/"$E"/"$S"/"$N"

#Resolution of the resulting grid
res=-I30m

#Name of the input dataset
data=peaks

#Radius of the moving Window in [km]
radius=100

#Generate Subset of Input Data: Region of Interest + Buffer larger the Radius of gmtselect
#Buffer in degrees
Buffer=1

Win=$(echo “$W - $Buffer” | bc -l)
Ein=$(echo “$E + $Buffer” | bc -l)
Sin=$(echo “$S - $Buffer” | bc -l)
Nin=$(echo “$N + $Buffer” | bc -l)
regInputData=-R"$Win"/"$Ein"/"$Sin"/"$Nin"

#Preprocess the point dataset --> Col 1 : long; Col 2 : lat
cat Peaks.P500.Sorted | awk ‘BEGIN{FS=","}{print $5,$4,$6,$7}’ | gmtselect $regInputData > $data

#Call function to count data within the defined radius
NumberOfDataPoints

FYI, see https://github.com/GenericMappingTools/gmt/pull/3950

Wow – this was fast! I’m keen to try it :slightly_smiling_face:

Thank you!

Hi all,

I just saw the discussion about the new module on github and wondering if the code is already part of the developement branch of GMT? I’ve just upgraded to “GMT - The Generic Mapping Tools, Version 6.2.0_fff59e9_2020.08.13” but can not find a module “gmtcount” or “gmtstats”.

cheers
Jörg

It is not merged into master but is presently in a branch called gmtcount, which you could check out and build. Likely to change but not tonight.

Ok - thank you for the fast response. It seems that the branch gmtcount requires a username and a password.
Anyhow - I think I can wait until the module is part of the master branch.

Thank you for all the effort developing GMT over so many years!

-J

Ooops - Eventually I could download the branch (used a wrong adress … ). The module is doing exaclty what I was seeking for and it is really fast :slightly_smiling_face:. I love this tool.

Thank you!!!
Jörg