I am trying to add a country field and a boolean flag to each row of a pandas dataframe with a few million records, each with a lat and a long floating-point field.
My plan was to add two extra columns to the dataframe: a string column for the country and a boolean column for land vs. sea.
So far I’ve tried geopandas, but all my attempts at converting the lat/long info to country/land-sea info scaled so badly that even processing a small chunk of my dataframe takes ages (something like 50 or 60 rows per second).
I tried a combination of two apply calls on the dataframe together with the geopandas “cx” indexer. I have a strong feeling that there should be a much faster and more efficient way to do this, but I’m at a loss right now.
I’m hoping, given the audience of this forum, maybe someone here has faced a similar question before.
Is there a way to do this task that takes minutes, not many many hours? Any input would be highly appreciated.
So, you want to know which country each of your million points belongs to, and whether it is on land or on water? Or only the latter? Because the latter is easily achievable with grdlandmask and grdtrack and should take only some tens of seconds.
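To illustrate why the mask route is fast, here is a minimal pure-Python sketch of the idea (not GMT code): the land/water test is rasterized once onto a regular grid, which is roughly what grdlandmask produces, and each point is then resolved by a plain index lookup, which is roughly what sampling the grid with grdtrack amounts to. The square "island", the grid extent and the function names build_mask and sample_mask are all made up for the example.

```python
def build_mask(is_land, west, south, east, north, step):
    """Rasterize a land test onto a regular grid (1 = land, 0 = water).

    Each cell is classified by testing its center point.
    """
    ncols = int(round((east - west) / step))
    nrows = int(round((north - south) / step))
    mask = []
    for r in range(nrows):
        lat = south + (r + 0.5) * step
        row = [1 if is_land(west + (c + 0.5) * step, lat) else 0
               for c in range(ncols)]
        mask.append(row)
    return mask


def sample_mask(mask, west, south, step, lon, lat):
    """Look up the cell containing (lon, lat); None if outside the grid."""
    col = int((lon - west) // step)
    row = int((lat - south) // step)
    if 0 <= row < len(mask) and 0 <= col < len(mask[0]):
        return mask[row][col]
    return None


# Hypothetical toy "island": a square from 10E to 20E and 40N to 50N.
def toy_is_land(lon, lat):
    return 10 <= lon <= 20 and 40 <= lat <= 50


# Build the mask once (1-degree cells), then reuse it for every point.
mask = build_mask(toy_is_land, west=0, south=30, east=30, north=60, step=1)

points = [(15.0, 45.0), (5.0, 35.0), (100.0, 0.0)]  # (lon, lat)
flags = [sample_mask(mask, 0, 30, 1, lon, lat) for lon, lat in points]
```

The expensive geometry work happens only when the mask is built; the price is that the answer is only as accurate as the cell size near the coastline.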
GMT also has country polygons (the DCW, used in pscoast) and land/sea areas that can be used to build grid masks. I’m not sure I get what you mean by country extended sea regions. Land + EEZ, for example? That GMT does not have.
Anyway, if you want to assign a country to each point of a million-point dataset, there is no escape other than to loop over those million points and find the nationality of each one. A slow process for sure. grdtrack can do it but would need some programming.
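As a rough sketch of what that loop looks like (plain Python, not GMT, and only one of several ways to do it): test each point against each country outline with a ray-casting point-in-polygon check, and skip most outlines cheaply by comparing against their bounding boxes first. The two toy "countries" and the helper names are invented for the example, and outlines are assumed to be simple rings without holes.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def bbox(ring):
    """Bounding box of a ring as (west, south, east, north)."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return min(xs), min(ys), max(xs), max(ys)


def assign_country(points, countries):
    """Return one country name (or None) per (lon, lat) point."""
    boxes = {name: bbox(ring) for name, ring in countries.items()}
    result = []
    for lon, lat in points:
        found = None
        for name, ring in countries.items():
            w, s, e, n = boxes[name]
            # Cheap bounding-box rejection before the full polygon test.
            if w <= lon <= e and s <= lat <= n and point_in_ring(lon, lat, ring):
                found = name
                break
        result.append(found)
    return result


# Hypothetical outlines: a square "A" and a triangle "B".
countries = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (15, 10)],
}
names = assign_country([(5, 5), (15, 3), (30, 30)], countries)
```

With real country outlines of many thousands of vertices, the polygon test dominates the run time, which is why the bounding-box rejection (or a proper spatial index) matters.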
As opposed to the standard territorial sea region assigned to a country, the sea areas touch in this file. I’m not sure exactly which legal frameworks lead to a country having two distinct sea areas, a standard one and an extended one, but for this use case I need the extended one.
I see. Those areas are what countries have submitted to the UN under the famous Article 76 to claim sovereignty rights over the SEA BOTTOM, which can, in the best cases, extend to 350 NM, … but have to cope with neighbors’ claims.
But regarding your case, I see no GMT usage. Only Python.
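For the Python-only route, one option that is usually far faster than row-wise apply calls is a single vectorized spatial join. The following is a minimal sketch, assuming a recent geopandas (0.10 or later, for the predicate keyword) with shapely; the two boxes stand in for the real country/extended-sea polygons, and the is_land attribute on them is a hypothetical column, not something taken from your file.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Toy stand-ins for the real polygons (hypothetical data and columns).
regions = gpd.GeoDataFrame(
    {"country": ["A", "B"], "is_land": [True, False]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
    crs="EPSG:4326",
)

# The dataframe with one lat and one long column per row.
df = pd.DataFrame({"lat": [5.0, 5.0, 50.0], "long": [5.0, 15.0, 50.0]})

# Build all point geometries in one vectorized call (x = long, y = lat).
points = gpd.GeoDataFrame(
    df.copy(),
    geometry=gpd.points_from_xy(df["long"], df["lat"]),
    crs="EPSG:4326",
)

# One spatial join for the whole table; unmatched points get NaN.
joined = gpd.sjoin(points, regions, how="left", predicate="within")

# The left join keeps the original index, so the columns align with df.
df["country"] = joined["country"]
df["is_land"] = joined["is_land"]
```

If the polygons overlap, a point can match more than one of them and the join will return duplicate rows for it, so the result would need deduplicating before being assigned back.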