GMTdataset to/from DataFrame in Julia

I have loaded a shapefile like so:

bldg = gmtread(joinpath(directory, "bldgs_preprocs_E4.shp"))

Vector{GMTdataset} with 7191562 segments
Show first segment. To see other segments just type its element number. E.g. D[7]

Attributes:  Dict("pga" => "0.379999995231628", "su_id" => "26107.000000000000000", "construc_1" => "{\"C99/LFINF+DNO/HBET:1,3\": 307.7814279, \"C99/LFINF+DNO/HBET:4,7\": 134.7351977, \"MUR+CL99/HBET:1,3\": 88.81801297}", "constructi" => "{\"C99/LFINF+DNO/HBET:1,3\": 3, \"C99/LFINF+DNO/HBET:4,7\": 1, \"MUR+CL99/HBET:1,3\": 1}", "osm_id" => "30962401.000000000000000", "E4" => "", "mean_FlowR" => "2.176470588235294", "std_FlowR" => "1.975958442049296")
BoundingBox: [338322.16972057807, 338343.0346088757, 3.0673524229467134e6, 3.0673803243858125e6]
PROJ: +proj=utm +zone=45 +datum=WGS84 +units=m +no_defs
WKT: PROJCS["WGS 84 / UTM zone 45N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",87],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1]]
5×2 GMTdataset{Float64, 2}
 Row │ X               Y
     │ Float64         Float64
─────┼───────────────────────────
   1 │      3.38322e5  3.06735e6
   2 │      3.38325e5  3.06738e6
   3 │ 338343.0        3.06738e6
   4 │      3.3834e5   3.06735e6
   5 │      3.38322e5  3.06735e6

I have two questions:
  1. How can I access the information contained in the shapefile as a DataFrame?
  2. How can I create a GMTdataset from a DataFrame with a geometry column?

Thank you for your help!

Quick answers as I won’t be able to follow up on this before the end of the afternoon.

  1. You’ll need to create the dataframe yourself. I would like to have more interfaces with DataFrames but I don’t want to make that package a GMT.jl dependency.

  2. You can create GMTdatasets with the function mat2ds (see the online help).

GMTdatasets have a lot of functionality. For example you can access columns by column name and attributes. Maybe you don’t need a dataframe. See also this example that uses dataframes.

See also all the fields of the GMTdataset type (it has a geometry field).


Thanks -

you can access columns by column name and attributes

Great! I would just need to know how to extract an attribute as an array so I can add it to a DataFrame. Could you show me how to do that? I can’t figure it out from the documentation.

Also, and this might be a different topic altogether, I can’t change the position of the legend on the plot like in this example from pygmt:

fig.legend(position="jTL+o0.1c", box=True)

Thanks a lot !

Attributes can be used in filters but not directly to select columns. See this example that uses dataframes. The author refers to the attributes as a header, probably because of when the example was written; the attributes field is now a Dict. This other example also uses the attributes, but it’s a bit more confusing, especially if one does not have the data files used in the example.
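
As a rough illustration of pulling one attribute out of every segment into an array, here is a minimal sketch. It assumes each segment stores its attributes as a Dict{String,String}, as in the printout above, reachable in GMT.jl through a field such as `attrib` (an assumption to check against the GMTdataset type documentation). To stay runnable without the shapefile, the hypothetical `segs` vector stands in for those per-segment dictionaries.

```julia
# Stand-in for the per-segment attribute dictionaries. With GMT.jl these would
# presumably come from something like [seg.attrib for seg in bldg] (assumption).
segs = [
    Dict("pga" => "0.379999995231628", "osm_id" => "30962401.000000000000000"),
    Dict("pga" => "0.25",              "osm_id" => "30962402.000000000000000"),
]

# Attribute values are stored as strings, so parse them while collecting.
pga    = [parse(Float64, a["pga"]) for a in segs]
osm_id = [Int(parse(Float64, a["osm_id"])) for a in segs]

# A NamedTuple of equal-length vectors is a Tables.jl-style column table,
# so DataFrame(cols) should accept it once DataFrames.jl is loaded.
cols = (pga = pga, osm_id = osm_id)
```

Building the columns as plain vectors first keeps DataFrames.jl optional: the same `cols` can be handed to any Tables.jl consumer.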

Regarding the label position, have you seen these examples?


I am making good progress and I hope to fully integrate GMT.jl into my workflow.

I have two questions left that would let me avoid PyCall. I hope they are not redundant with your previous answer; I might have missed something.

You mentioned that it is possible to select a subset of a GMT dataset based on an attribute condition (e.g. pts[pts.A .== 999, :] for a DataFrame).

How can you do this in GMT.jl for a polygons dataset?

What is the fastest way to iterate through the rows in a polygon GMT dataset?

Thanks (again)

I’m not sure about your first question, but GMTdatasets implement the AbstractArray and Tables interfaces, so you can index them like ordinary arrays, e.g.

for n = 1:size(D,1)
    println(D[n,:])                 # Print the D rows
end

Or simply D[20:30, 60:90] (and they can be 3D too).

Edit: And if you want the result to remain a GMTdataset, use the function mat2ds(D::GMTdataset, inds), where inds is a matrix range, e.g. (20:30, 60:90).
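
On the first question (subsetting polygons by an attribute), one possible approach relies only on the fact that the shapefile loads as an ordinary Julia Vector of segments, so Base `filter` applies to it and returns a vector of the same element type. The sketch below is not GMT.jl's own filtering mechanism; the `Seg` struct and its `attrib` field are hypothetical stand-ins for a GMTdataset segment so that the example runs without GMT.jl or the data.

```julia
# Hypothetical stand-in for a GMTdataset segment: vertices plus string attributes.
struct Seg
    data::Matrix{Float64}
    attrib::Dict{String,String}
end

polys = [
    Seg([0.0 0.0; 1.0 0.0; 1.0 1.0; 0.0 0.0], Dict("su_id" => "26107", "pga" => "0.38")),
    Seg([2.0 2.0; 3.0 2.0; 3.0 3.0; 2.0 2.0], Dict("su_id" => "999",   "pga" => "0.12")),
    Seg([4.0 4.0; 5.0 4.0; 5.0 5.0; 4.0 4.0], Dict("su_id" => "26107", "pga" => "0.51")),
]

# Keep only polygons whose attribute matches; the result is still a Vector{Seg}.
subset = filter(p -> get(p.attrib, "su_id", "") == "26107", polys)

# Numeric conditions need the string parsed first.
strong = filter(p -> parse(Float64, p.attrib["pga"]) > 0.5, polys)

# Iterating polygon by polygon is a plain loop over the vector.
for (i, p) in enumerate(subset)
    println("polygon ", i, " has ", size(p.data, 1), " vertices")
end
```

Using `get` with a default avoids a KeyError when a segment lacks the attribute, which can matter on a dataset with millions of segments.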

Thanks, I will try that.

Yes, keeping it as a GMTdataset was one of the problems I had :)