Can pyGMT remove duplicates when going from xyz to grid?

JohannaWren · August 28, 2025, 10:03pm

Hi,

I want to convert lon, lat, value (xyz) data to a non-overlapping, evenly spaced grid and count the number of unique values in each grid cell. I can get the total count of values for each grid cell with either xyz2grd setting duplicate="n" or blockmean setting summary="n", but I want just the count of the unique values. Is that possible in pyGMT?

To give you an example I’m using the blockmean example with minor changes. I added an ID column where several locations have the same ID. Instead of just getting the number of earthquakes in each cell, I’d like to know how many unique IDs are in each cell if that’s possible.

import numpy as np
import pygmt

# Load sample data
data = pygmt.datasets.load_sample_data(name="japan_quakes")
# Select only needed columns
data = data[["longitude", "latitude", "depth_km"]]
# Add an ID column of random integers 1-4, so several locations share an ID
rng_seeded = np.random.default_rng(seed=42)
data.insert(3, "ID", rng_seeded.integers(1, 5, size=len(data)))

# Set the region for the plot
region = [130, 152.5, 32.5, 52.5]
# Define spacing in x- and y-directions (150x150 arc-minute blocks)
spacing = "150m"

fig = pygmt.Figure()

# Calculate number of total locations within 150x150 arc-minute bins
# duplicate="n" counts the points per node (duplicate="z" would sum the ID values instead)
grd = pygmt.xyz2grd(data=data.drop(columns="depth_km"), region=region, spacing=spacing, duplicate="n")

fig.grdimage(
    grid=grd,
    region=region,
    frame=["af", "+tNumber of points inside each block"],
    cmap="batlow",
)
fig.coast(land="darkgray", transparency=40)
fig.plot(x=data.longitude, y=data.latitude, style="c0.3c", fill="white", pen="1p,black")
fig.colorbar(frame="x+lcount")

fig.show()

Thank you!!

Esteban82 · August 29, 2025, 2:11am

Hi Johanna. Welcome to the GMT forum!

You could use blockmean -Sn to count the total number of values in each cell. Then you can check how many nodes have a count of 1: use grd2xyz to extract the values to a table, and count how many 1s it contains.
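
The count-then-tally logic of that suggestion can be sketched in plain NumPy rather than GMT. This is only an illustration under assumptions: np.histogram2d stands in for blockmean -Sn, the helper name count_points_per_cell is hypothetical, the cells are taken to start at the west/south edge of the region, and the four points are made up. Note that it counts cells holding exactly one point, which is not the same as counting unique IDs per cell.

```python
import numpy as np


def count_points_per_cell(lon, lat, region, spacing_deg):
    """Count points per cell (hypothetical helper, not part of pyGMT).

    Cells are spacing_deg wide and start at the west/south edge of region.
    """
    west, east, south, north = region
    # Cell edges; the half-step keeps the final edge despite float rounding
    x_edges = np.arange(west, east + spacing_deg / 2, spacing_deg)
    y_edges = np.arange(south, north + spacing_deg / 2, spacing_deg)
    counts, _, _ = np.histogram2d(lon, lat, bins=[x_edges, y_edges])
    return counts.astype(int)


# Made-up points: the first two fall into the same 2.5-degree cell
lon = np.array([130.5, 131.0, 140.2, 151.9])
lat = np.array([33.0, 33.5, 40.1, 52.0])

counts = count_points_per_cell(
    lon, lat, region=[130, 152.5, 32.5, 52.5], spacing_deg=2.5
)

# Tally the cells that contain exactly one point
n_single = int((counts == 1).sum())
print(n_single)
```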

mkononets · August 29, 2025, 10:04am

I think it is impossible to get a count of unique values per grid cell using (py)GMT.
I thought binstats could possibly count unique values, but no, it counts the total number of values per cell, just like xyz2grd or blockmean.
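
One possible workaround is to do the binning outside GMT. Below is a minimal sketch with pandas, assuming the column names longitude, latitude and ID from the example in the first post; the helper unique_count_per_cell is hypothetical, the cells are assumed to start at the west/south edge of the region (pixel-style), and the sample rows are made up.

```python
import numpy as np
import pandas as pd


def unique_count_per_cell(df, region, spacing_deg, id_col="ID"):
    """Count distinct id_col values per cell (hypothetical helper).

    Returns one row per non-empty cell: cell-center lon/lat and n_unique.
    """
    west, east, south, north = region
    nx = int(round((east - west) / spacing_deg))
    ny = int(round((north - south) / spacing_deg))

    # Drop points outside the region
    inside = df["longitude"].between(west, east) & df["latitude"].between(south, north)
    df = df[inside]

    # Integer cell index per point; clip so points exactly on the
    # east/north edge land in the last cell
    ix = np.floor((df["longitude"].to_numpy() - west) / spacing_deg).astype(int)
    iy = np.floor((df["latitude"].to_numpy() - south) / spacing_deg).astype(int)
    ix = np.clip(ix, 0, nx - 1)
    iy = np.clip(iy, 0, ny - 1)

    binned = pd.DataFrame({"ix": ix, "iy": iy, id_col: df[id_col].to_numpy()})
    out = binned.groupby(["ix", "iy"])[id_col].nunique().reset_index(name="n_unique")

    # Convert cell indices back to cell-center coordinates
    out["longitude"] = west + (out["ix"] + 0.5) * spacing_deg
    out["latitude"] = south + (out["iy"] + 0.5) * spacing_deg
    return out[["longitude", "latitude", "n_unique"]]


# Made-up rows: three points share the first cell (IDs 1, 1, 2),
# two points share another cell (IDs 3, 3)
sample = pd.DataFrame(
    {
        "longitude": [130.5, 131.0, 131.2, 140.2, 140.3],
        "latitude": [33.0, 33.5, 34.0, 40.1, 40.2],
        "ID": [1, 1, 2, 3, 3],
    }
)
result = unique_count_per_cell(sample, region=[130, 152.5, 32.5, 52.5], spacing_deg=2.5)
print(result)
```

The resulting table has one value per cell, so it should be usable as xyz input to pygmt.xyz2grd with pixel registration; that last step is an assumption and is not tested here.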

Joaquim · August 29, 2025, 1:10pm

That would be a reasonable feature request.

JohannaWren · September 9, 2025, 11:49pm

Thanks so much for your replies! I’ll put in a feature request and will figure out some other way to do this. I can do it in R, but I haven’t found an equally simple way in Python yet.
