Access greenspline's model coefficients

I am fitting a 2D depth surface (z_mod) to some 170 data points (z_obs) using greenspline. To account for noise in the data, I’m using singular value decomposition to model the surface and am currently choosing a eigenvalue cut-off subjectively, using:

gmt greenspline $dep -G$svd -L $R -Z2 -I$I -W -C+c -E$mis -V

then plotting a bunch of the resulting files.

I’d now like to arrive at a more objective cut-off value by plotting the famous “L-curve” (data variance vs. model variance), frequently used for determining the damping parameter in inverse problems:

data variance
^
|    | function of svd cut-off value 
|    |
|    |
|    |
|    |
|    \
|     \* Optimal damping
|       \_________________________
|  
+--------------------------------------- >
                            model variance

I can compute the data variance as sum((z_mod - z_obs)**2). However, I’m having problem plotting the model variance. Essentially, I’m unable to access the spline coefficients alpha in the greenspline source.

I’ve tried various combinations of -E -C+f, but was unable to access the model parameters. Or am I overseeing something? Can I compute it from the eigenvalues (supplied by the -E option)?

Best fix: Is it me, have I missed something?
Dirty fix: Can I compile greenspline easily, so I can send alpha to stdout?
Elegant fix: Would it be possible to implement e.g. an option -C+afile: “For each cutoff eigenvalue, write out the alpha coefficients to file (we automatically insert “###” before the extension, see +c and +i).”

Could they be given using -Vd ?

Nice shot, but they unfortunately aren’t.

Dear community
I could compile the greenspline source without major issues and now have my private version in which I write out the model norm as an INFORMATION.

	for (e = 0; e < (uint64_t)n_use; e++) {	/* Only loop over the first n_use eigenvalues (if restricted) */
        GMT_Report (API, GMT_MSG_INFORMATION, "Evaluate spline for eigenvalue # %d\n", (int)e);
        gmt_M_memcpy (s, ssave, nm, double);	/* Restore original values before call */
        (void)gmt_solve_svd (GMT, A, (unsigned int)nm, (unsigned int)nm, v, s, b, 1U, obs, (double)e, GMT_SVD_EIGEN_NUMBER_CUTOFF);
        /* obs (hence alpha) now has the solution for the coefficients based on the first e eigenvalues */
        for (k = 0; k < nm; k++)
	        modnorm += obs[k] * obs[k];
        GMT_Report (API, GMT_MSG_INFORMATION, "Used Eigenvalues, Model norm = %d, %g\n", e, modnorm);

Thanks for the nicely maintained and commented source code and easy compilation instructions :slight_smile:

@pwessel : Is it worth implementing into gmt with an extra option now that @wsja has basically done it?

1 Like

Looking at this, could your modnorm value be added as a column to the output file that -E sets? Would be pretty simple for me to add that, possibly via a -E modifier.

But first, why were you unable to use -C+f?

Append +f file to save the eigenvalues to the specified file for further analysis

The eigenvalues is not what I need, I need the alphas, dependent on n_use.

From C+f I get what I think would be what be the singular values (The diagonal elements of S, from G = USV^T; they are in the 1e18 - 1e0 range. I would expect the alphas to span at most 5 or 6 orders of magnitude). Moreover, I only get one set of e values (e = number of data points = number of model parameters), not e times e.

Yes, that would probably be the easiest one. In fact, two extra columns would be nice: model norm (as modnorm, above) and data norm (with the weights applied just as in the misfit function).

OK, got you this time. Let me see if I can whip out a branch for your testing,

1 Like

I dont know what you mean here. obs is an array.

obs[k] * obs[k]

OK, that is what I did. See if you can test this branch and comment on what is missing: https://github.com/GenericMappingTools/gmt/pull/7358

I think there is some syntactical mishap:

Should it be:

-E+nfile (this throws an: [Session greenspline (0)]: Error returned from GMT API: GMT_ERROR_ON_FOPEN (15))

or

-Efile+n (which does not recognize the +n flag, but simply appends an “+n” to the file name)

Syntax is

-E[misfitfile][+n]

There is no modifier to set the filename (existing syntax) but you gotta append the new modifier +n if you want effect.

Yes, I understand. However, this is not what the current implementation is doing. When writing:

greenspline $dep -G$svd -L $R -Sp -Z2 -I$I -C+c -W -Emisfit.dat+n

gmt creates a file called “misfit.dat+n”, without any extra column. The +n modifier is currently not recognized, but appended to the filename.

Are you sure you have checked out the branch green-modnorm? Or are you rebuilding master which will not do anything.?

$ git branch
  * green-modnorm
    master
$ cd build
$ cmake --build . --target install

Where my ConfigUser.cmake has:

set (CMAKE_INSTALL_PREFIX /home/myusername/.local)
set (GMT_INSTALL_MODULE_LINKS TRUE)

and

$ which greenspline
/home/myusername/.local/bin/greenspline

This previously worked well with my local commits and greenspline was the (almost) only thing recompiling after I pulled in your changes. Am I doing anything incorrectly?

My output is this (note version and synopsis. Not I have nothing to run it on so cannot say if you are wrong but the code looks for +n and strips it off like we do in lots of places in Gmt;

gmt greenspline -

gmt greenspline [core] 6.5.0_b0ef92b-dirty_2023.03.31 - Interpolate using Green's functions for splines in 1-3 dimensions

usage: gmt greenspline [<table>] -G<outfile> [-A<gradientfile>+f<format>] [-C[[n|r|v]<val>[%]][+c][+f<file>][+i][+n]] [-D<information>]
[-E[<misfitfile>][+n]] [-I<dx>[/<dy>[/<dz>]]] [-L[t][r]] [-N<nodefile>] [-Q[<az>|<x/y/z>]] [-R<xmin>/<xmax>[/<ymin>/<ymax>[/<zmin>/<zmax>]]] [-Sc|⏎
…l|t|r|p|q[<pars>]] [-T<maskgrid>] [-V[q|e|w|t|i|c|d]] [-W[w]] [-Z<mode>] [-bi<record>[+b|l]] [-d[i|o]<nodata>[+c<col>]] [-e[~]<pattern>|/<regexp>/⏎
…[i]|+f<file>] [-f[i|o]<colinfo>] [-gx|y|z|d|X|Y|D<gap>[<unit>][+a][+c<col>][+n|p]] [-h[i|o][<nrecs>][+c][+d][+m<segheader>][+r<remark>][+t<title>]]
[-i<cols>[+l][+d<divisor>][+s<scale>|d|k][+o<offset>][,...][,t[<word>]]] [-o<cols>[,...][,t[<word>]]] [-q[i|o][~]<rows>[,...][+c<col>][+a|t|s]]
[-r[g|p]] [-s[<cols>][+a][+r]] [-wy|a|w|d|h|m|s|c<period>[/<phase>][+c<col>]] [-:[i|o]] [--PAR=<value>]

I confirm I got the same version and usage.

dep.dat could be:

-126.757 50.312 27.6715 5
-126.543 50.248 32.0467 3
-126.601 50.203 30.2045 6
-127.772 50.157 15.3931 5
-126.809 50.142 28.6611 2
-126.661 50.165 27.6259 3
-126.575 50.167 28.7127 4
-126.57 50.161 31.2128 2
-126.441 50.121 31.4285 3
-125.365 50.032 48.5193 4

and

R=-R-130/-120/40/60
I=0.05
G=tmp.grd
greenspline dep.dat -G$G -L $R -Sp -Z2 -I$I -C+c -W -Emisfit.dat+n

would reproduce the problem on my side.

yep, No coffee == more likely to do copy/paste bugs. do a git pull