# Issues with interpreting the correlation (r) and coefficient of determination (R) in gmt regress

I am using gmt regress with -Er -Nw -Fxymc. I am not a statistics expert, but my understanding is that the coefficient of determination (R) is the correlation coefficient squared, so it should always lie between 0 and 1. However, I sometimes get pretty weird values for R, namely values that are negative and far outside the range 0-1; my most extreme example is R = -510157. It is clear that the correlation is crappy when this happens, but I still don’t understand how R is computed, what such large negative values mean, or what any negative value means.

Hi Dietmar, yes, I would agree with what you said about R. I note you are doing an RMA regression to identify outliers, then giving those bad boys a weight of zero and redoing the regression (-Nw). Possibly there is something odd going on in the r calculation. Might you be able to post a small dataset illustrating a case with a rotten R so I can debug?

Best regression: N: 10 x0: 788.537 y0: -2628.76 angle: 65.0863 E: 920061 slope: 2.15296 icept: -4326.46 sig_slope: 0 sig_icept: 0 corr: 0.512932 R: -510157
730.19 -3328.16 -2754.38505347 -573.774946535 0 -0.278951733032 1
1429.15 -2664.21 -1249.54949804 -1414.66050196 0 -0.486958687895 1
748.66 -4388.07 -2714.61981267 -1673.45018733 0 -1.03630574387 1
1526777.89 -1022.19 3282771.0216 -3283793.2116 0 -1464.36439975 0
1362.17 -1871.66 -1393.75501124 -477.904988765 0 0.130420669993 1
922.06 -1966.48 -2341.29589526 374.81589526 0 0.486958687895 1
697.35 -2621.51 -2825.08838361 203.578383613 0 0.245696043955 1
411.4 -3067.84 -3440.72837313 372.888373131 0 0.208884127261 1
7.32 -1122.19 -4310.69797259 3188.50797259 0 1.95467490733 1
3972.23 -2208.43 4225.6096016 -6434.0396016 0 -2.61141603476 0

Thanks for looking into this, Paul. My dataset is only tiny, and perhaps does not warrant regression, but I think there may be a bug …

A few issues; see https://github.com/GenericMappingTools/gmt/pull/3213 for the resolution. There was one bug that affected r, another minor issue that is in the noise, and the decision that R cannot be computed or printed when the misfit is not vertical (at least until we learn of equivalent expressions for orthogonal regression; so far there are none). Master is updated. The r is now a sensible 0.482664 and R is not reported.
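For anyone wondering why R only makes sense for vertical misfit, here is a minimal NumPy sketch. It is not GMT's internal code. It assumes the textbook definition R = 1 - SS_res/SS_tot with vertical residuals, and the helper name `r_squared` and the toy data are made up for illustration. For an ordinary least-squares line this quantity equals r squared, but for a reduced major axis line (slope = sign(r) * sy/sx through the data means) it works out to 2|r| - 1, which is negative whenever |r| < 0.5.

```python
import numpy as np

def r_squared(y, y_hat):
    """Textbook R = 1 - SS_res/SS_tot, using vertical residuals (hypothetical helper)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy data with a weak positive correlation (invented for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 0.0, 3.0, 2.0])

r = np.corrcoef(x, y)[0, 1]

# Ordinary least squares (vertical misfit): R equals r**2.
slope_ols, icept_ols = np.polyfit(x, y, 1)
R_ols = r_squared(y, slope_ols * x + icept_ols)

# Reduced major axis: slope = sign(r) * sy/sx, line passes through the means.
slope_rma = np.sign(r) * np.std(y) / np.std(x)
icept_rma = np.mean(y) - slope_rma * np.mean(x)
R_rma = r_squared(y, slope_rma * x + icept_rma)

print(f"r = {r:.4f}, r^2 = {r**2:.4f}")
print(f"R for OLS line = {R_ols:.4f}")
print(f"R for RMA line = {R_rma:.4f}  (2|r| - 1 = {2 * abs(r) - 1:.4f})")
```

So a negative R on its own need not be a bug when the line was not fitted by minimizing vertical misfit; it just means the R = r squared identity no longer applies. This sketch does not attempt to reproduce the -510157 value above.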