Problem reading NASA/USGS-produced geoTIFFs for Venus data

planetpatman · June 20, 2024, 7:35pm

Dear GMT community forum,

We are having problems with a (n alleged) geoTIFF file created by NASA containing data from the Magellan mission to Venus. Actually several of them, but for the purposes of this post, I will confine myself to a relatively small file (although we eventually want to use a huge one).

We are trying to use pyGMT to display this Synthetic Aperture Radar (SAR) data for Venus volcanic structures, but apparently the GMT means of importing a geoTIFF (via GDAL) does not recognize the geographic nature of the data, interpreting the geographic bounds (lon/lat) that we are requesting as the native (x/y) coordinates of the TIFF image.

And yes, our end product will be used with pyGMT, but I’m also using “regular” GMT6 to work on this, and both modes are experiencing the same problem, so this is not a “GMT” vs “pyGMT” issue, but rather an overall GMT issue with reading these particular geoTIFFs.

Here is the output of grdinfo for the file we are trying to use:

(base) MP-McGovern-14:Venus mcgovern$ gmt6 grdinfo Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Title: Grid imported via GDAL
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Command:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Remark:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Pixel node registration used [Cartesian grid]
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Grid file format: gd = Import/export through GDAL
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: x_min: -19009777.1469 x_max: 19009597.8531 x_inc: 2025 name: x n_columns: 18775
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: y_min: -9505811.42656 y_max: 9504888.57344 y_inc: 2025 name: y n_rows: 9388
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: z_min: 1 z_max: 244 name: z
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: scale_factor: 1 add_offset: 0
+proj=eqc +lat_ts=0 +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +R=6051000 +units=m +no_defs

Note that the x and y information is in the raw image coordinate system and not lon/lat like it should be. Consider the equivalent output from a Mars data file (also from the same source, the NASA/USGS Astropedia site):

(base) MP-McGovern-14:HRSC_MOLA mcgovern$ gmt6 grdinfo Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: Title: Grid imported via GDAL
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: Command:
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: Remark:
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: Pixel node registration used [Geographic grid]
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: Grid file format: gd = Import/export through GDAL
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: x_min: -180 x_max: 179.998447904 x_inc: 0.00337412083064 name: x n_columns: 106694
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: y_min: -89.9992239522 y_max: 90 y_inc: 0.00337412083064 name: y n_rows: 53347
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: z_min: -7917 z_max: 20834 name: z
Mars_HRSC_MOLA_BlendDEM_Global_200mp_v2.tif: scale_factor: 1 add_offset: 0
+proj=longlat +R=3396190 +no_defs

They are both geoTIFFs generated by the same agency/department, but this one (when imported by GDAL) was interpreted with the boundaries being in the correct lon/lat system, while the Venus one (also imported by GDAL) was not interpreted correctly. I will note at this point that we have tried things like grdcut, grdconvert, and even gdal_translate on the first (Venus) file, and none of these are able to recognize the geographic coordinate system. I also have some ideas about brute-force switching out of the coordinate systems (i.e., re-writing the file with the old data and new (geographical) coordinates, but I hope to avoid that (especially because the big file we’d like to use is 110 GB!!)

One difference that I see is in the last lines of the grdinfo readouts: the (correctly behaving) mars file has the simple readout “+proj=longlat +R=3396190 +no_defs”, while the (incorrectly behaving) Venus file has the more complex readout “+proj=eqc +lat_ts=0 +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +R=6051000 +units=m +no_defs”. Perhaps this information is incorrect and causing my problem? And if we edited it, perhaps the file could allow recognition of the geographic coordinate system?

Any advice y’all could give us as to how to force recognition of the geographic coordinate system would be gratefully accepted!

Best regards,
Pat McGovern

mkononets · June 20, 2024, 7:48pm

planetpatman:

Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Title: Grid imported via GDAL
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Command:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Remark:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Pixel node registration used [Cartesian grid]
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: Grid file format: gd = Import/export through GDAL
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: x_min: -19009777.1469 x_max: 19009597.8531 x_inc: 2025 name: x n_columns: 18775
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: y_min: -9505811.42656 y_max: 9504888.57344 y_inc: 2025 name: y n_rows: 9388
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: z_min: 1 z_max: 244 name: z
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif: scale_factor: 1 add_offset: 0
+proj=eqc +lat_ts=0 +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +R=6051000 +units=m +no_defs

The Venus grid is a projected Cartesian grid. Coordinates are not geographical, projected meters.

The Mars grid is however geographical (x = -180…+180, y = -90…+90, +proj=lonlat…)

Maybe you can look thoroughly in the source Venus files and find a geographical grid for Venus?

planetpatman · June 20, 2024, 7:54pm

Well yes, that is the nub of the problem: it seems that the Venus grid was improperly prepared. I was expecting a reading with GDAL to properly read both.

As I understand it, the two files are both TIFF images, so the “base” coordinate system in both files is NOT geographic. It is only GDAL that resolves them into geographical. But this fails for the Venus grid (and succeeds for the Mars one). That is the interaction that I am asking about here!

Thanks,
Pat

mkononets · June 20, 2024, 8:32pm

if Venus grid in geographical coordinates is not available, you need to reproject your Cartesian Venus grid. Something like below

gdalwarp Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif -t_srs "+proj=lonlat +R=6051000 +no_defs" Venus_Magellan_C3-MDIR_Global_Mosaic_2025m-geo.tif

Joaquim · June 20, 2024, 9:42pm

Neither GMT or GDAL decide to change the coordinate system by themselves. They just report what the file says it has.

Esteban82 · June 21, 2024, 2:37am

Hi Pat.

I think we had the same problem when we added the planet grids to the remote data sets.

What Paul had done at the time was to use this command to edit grdedit to change the XY values to Longitude and Latitude.

I try this command with your grid (that I downloaded)/

gmt grdedit Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif -Rd -fg -GVenus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc

Then, I got this info


$ gmt grdinfo Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Title: Grid imported via GDAL
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Command:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Remark:
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Pixel node registration used [Geographic grid]
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Grid file format: nf = GMT netCDF format (32-bit float), CF-1.7
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: x_min: -180 x_max: 180 x_inc: 0.0191744340879 name: longitude n_columns: 18775
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: y_min: -90 y_max: 90 y_inc: 0.0191734128675 name: latitude n_rows: 9388
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: v_min: 0 v_max: 255 name: z
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: scale_factor: 1 add_offset: 0
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: format: netCDF-4 chunk_size: 129,129 shuffle: on deflation_level: 3
Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.nc: Default CPT:
PROJCS["unknown",
    GEOGCS["unknown",
        DATUM["unknown",
            SPHEROID["unknown",6051000,0]],
        PRIMEM["Reference meridian",0],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]]],
    PROJECTION["Equirectangular"],
    PARAMETER["standard_parallel_1",0],
    PARAMETER["central_meridian",0],
    PARAMETER["false_easting",0],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AXIS["Easting",EAST],
    AXIS["Northing",NORTH]]

And this map

I am not sure if it is correct for this case.

planetpatman · June 21, 2024, 2:37am

Aha, thank you for pointing out the gdalwarp command! It almost solves my problem. It would completely solve my problem, except for the fact that it calculates the maximum latitude as something slightly higher than 90.0 degrees (likely due to roundoff error related to the irregular x and y increment values), and assigns an “NaN” to the end of the latitude vector. I will try to find some work-around for this, perhaps in one of the command line options for gdalwarp, or perhaps something in MATLAB.

Sincerely,
Pat

planetpatman · June 21, 2024, 2:37am

Thank you for the clarification on what the geoTIFFs should be doing. What this apparently means is that NASA/USGS is putting out defective geoTIFFs that aren’t true geoTIFFs for all their Venus products. This is of course of great concern to me as a Venus scientist! I will be contacting them for sure to let them know about the problem.

I will also note that I have been successfully using a Venus topography file, prepared by the same NASA/USGS group, for several months now. I could not use it in its raw form, and I figured out how to fix it: by loading it into MATLAB, using the bounds and increments of the x,y system to create the proper lon,lat system, and writing the data with the new system as a netCDF grd file. I used Kelsey Jordahl’s old “grdread2.m” and “grdwrite2.m” scripts to do this. As I mentioned in my previous message, I didn’t want to do this with the Venus global image file we ultimately want to use because it is 110 GB and my main computer only has about 1/4 of that in RAM. But it turns out when I carved a small section out of the file, MATLAB via grdread2 couldn’t read it anyway because it returns the error message “Wrong number of variables in netCDF file!” There was also a problem with the unsigned 8 bit integer format of the image file, but I can’t remember how that came up given my other problems (probably from some earlier attempt to solve the problem a different way).

In any event I’m starting to use the baseline MATLAB tools for reading netCDF (like “netcdf.open”, “netcdf.getvar” and so on), so perhaps I find another “fix the coordinate system and export” solution like I did with the topography file. Thanks for the helpful discussions everyone!

Sincerely,
Pat

planetpatman · June 21, 2024, 2:43am

Hi Esteban,

As you can see, I just now responded to mkononets and Joaquim, in appreciation of and response to their helpful comments. And now I thank you for your informative reply. I will look into the specific form of grdedit that you supplied to see if it solves my problem.

The file you downloaded contains a grayscale image for Venus, but if I mentally interpolate a rainbow color table to grayscale it looks like you have plotted the data set correctly! The funny thing is, the more elevated regions of Venus also tend to be brighter in the radar image (a combination of roughness from tectonics and elevation-related chemistry), so your map also sort of looks like a Venus topography map.

But in any event, thanks very much!

Sincerely,
Pat

espiritocz · June 21, 2024, 9:25am

just a detail, after the grdedit, extent shows -180:180/-90:90 in pixel registration. Shouldn’t -180:180-(x_inc/2)… be the correct coords?

Joaquim · June 21, 2024, 10:56am

I really do not know how to workaround the fact that the file claims to have a latitudinal extent that overflows 90 degrees by > 1/2 degree, which is a LOT

gdalinfo Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tiff
...

Upper Left  (-19009777.147, 9504888.573) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left  (-19009777.147,-9505811.427) (180d 0' 0.00"W, 90d 0'31.46"S)
Upper Right (19009597.853, 9504888.573) (179d59'53.89"E, 90d 0' 0.00"N)
Lower Right (19009597.853,-9505811.427) (179d59'53.89"E, 90d 0'31.46"S)
Center      (     -89.647,    -461.427) (  0d 0' 3.06"W,  0d 0'15.73"S)

Also, going from 1 to 4 bytes, which is what the grdedit does, in a 110 GB file does not seem the best idea.

Joaquim · June 21, 2024, 11:40am

That is another side of the problem. According to gdalinfo the image increments are

Data axis to CRS axis mapping: 1,2
Origin = (-19009777.146871998906136,9504888.573435999453068)
Pixel Size = (2025.000000000000000,-2025.000000000000000)

and the East limit is (see above) 179d59'53.89"E, which also does not make sense since the ~6 arc sec missing to 180 are far from 1/2 x_inc

6051000 * 2pi / 360 / 3600 * 6 = 176 m

mkononets · June 21, 2024, 2:41pm

Joaquim:

That is another side of the problem. According to gdalinfo the image increments are
Data axis to CRS axis mapping: 1,2
Origin = (-19009777.146871998906136,9504888.573435999453068)
Pixel Size = (2025.000000000000000,-2025.000000000000000)
and the East limit is (see above) 179d59'53.89"E, which also does not make sense since the ~6 arc sec missing to 180 are far from 1/2 x_inc

6051000 * 2pi / 360 / 3600 * 6 = 176 m

The following appears strange to me:

for a geographical grid n_columns should be an even number, that is, n_rows times 2. For the Venus grid above 9388*2 = 18776, while the reported n_columns is 18775, that is one column too low. This is assuming n_rows of 9388 is correct (not say 1 too high or too low). Anyway a geographical grid that covers the global region with square pixels should have x:y count ratio of 2:1.

Actual grid resolution is reported to be 2025 m. Venus grid size reported is 18775x9388
6051000* 2* pi / 18775 = 2025.0095496
6051000* pi / 9388 = 2024.901698644
18775 / 2 = 9387.5 while y n_rows is 9388.

so the selected grid size does not perfectly match the globe of Venus.

Esteban82 · June 21, 2024, 2:45pm

Well, the size of the original file (tiff file) is 176 MB and was reduced to 105 MB.

But your point is valid. I try to generated a 1-byte netcdf file (by adding =nb) but I got an error.

gmt grdedit Venus_Magellan_C3-MDIR_Global_Mosaic_2025m.tif -Rd -fg -GVenus_Magellan_C3-MDIR_Global_Mosaic.nc=nb
grdedit [ERROR]: Cannot write format nb = GMT netCDF format (8-bit integer), CF-1.7.
grdedit [ERROR]: The packed z-range, [0,255], exceeds the maximum representable size. Adjust scale and offset parameters or remove out-of-range values.
grdedit (gmtapi_export_grid): NetCDF: Numeric conversion not representable [Venus_Magellan_C3-MDIR_Global_Mosaic.nc=nb]
[Session gmt (0)]: Error returned from GMT API: GMT_GRID_WRITE_ERROR (19)
[Session gmt (0)]: Error returned from GMT API: GMT_GRID_WRITE_ERROR (19)

mkononets · June 21, 2024, 2:58pm

this almost looks like gridline registration along y axis and pixel registration along x axis, if this is possible at all (and if I correctly understand pixel and gridline registration). That is 18775 pixels and 9388 gridlines. This way Venus grid would match the planet.

Joaquim · June 21, 2024, 3:44pm

Because of the netCDF compression, and that idicates that the tiff file was not compressed, as it could have been too. Anyway, we are talking on storage size. Whenever the file is used we have to pay the price of its true size.

And yes, netCDF has had long standing problems storing 1 unsigned bytes (chars) variables.

Joaquim · June 21, 2024, 3:45pm

I remember to to have made exactly this comment about some other NASA produced grids.

mkononets · June 21, 2024, 5:01pm

But this is not possible for a grid covering the global domain, right? Unless the very first (-180 degrees) and the last (+180 degrees) column values match exactly (which is not the case for the Venus grid)?

Joaquim · June 21, 2024, 5:11pm

While technically possible, it doesn’t make any sense to me. And it would require that softwares checked that crazy case. Unless there is some technical explanation on why NASA produces this, I consider them buggy. Again, latitude S < -90 & lat N = 90??

planetpatman · June 21, 2024, 6:36pm

Hi Mikhail, Esteban, Joaquim, et al.,

Thanks for the great discussion here. It is clear that the files in question are, to use the vernacular, “f***ed up”, and that there needs to be some communication with NASA/USGS to rectify the situation. Fortunately, I have what is perhaps a route to getting some attention for this issue, through my co-Chair position on a joint NASA/ESA committee on maximizing returns from the upcoming fleet of Venus missions: the Venus Science Coordination group, or VeSCoor. I am the “NASA-side” co-Chair, and I am learning about the best channels to communicate with various elements of the Venus community, including the groups at USGS Astrogeology. So perhaps I will be able to push forward some needed changes.

I also wanted to clarify something: there are two Venus image files in this discussion. We are testing the operations on the much-degraded (and therefore much smaller) “C3” version of the image, but are hoping to work with the full-resolution image file, which is 3 orders of magnitude larger, as seen in this output from “ls -l *.tif”:

So I’m sorry, the big one wasn’t a 110 GB file, it was 117 GB

But yeah, it is stored in the natural format for grayscale images (0-255), as unsigned 8-bit integers, which perfectly match that span! Joaquim wrote the following:

This is the most CHILLING line regarding my upcoming work needs that I have read in years. How can such a “universal” format have a fundamental problem with the most basic method for storing grayscale images? This CAN’T possibly be true, can it? What were they thinking? This is one of those “this universe is broken” things for me. This also means that we need to keep the variables in the NASA/USGS-provided files they are in, rather than write them to netCDF at a (factor of 4, was it?) premium in disk space. But this also means we need NASA/USGS to provide non-messed-up files. Something I will work on in the coming months.

Anyway, thanks again all!

Pat