Thread
-
Re: BUG #19340: Wrong result from CORR() function
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-03T01:27:56Z
I wrote: > I'm coming around to the conclusion that your way is better, > though. It seems good that "any NaN in the input results in > NaN output", which your way does and mine doesn't. Poking further at this, I found that my v2 patch fails that principle in one case: regression=# SELECT corr( 0.1 , 'nan' ) FROM generate_series(1,1000) g; corr ------ (1 row) We see that Y is constant and therefore return NULL, despite the other NaN input. I think we can fix that along these lines: @@ -3776,8 +3776,12 @@ float8_corr(PG_FUNCTION_ARGS) if (N < 1.0) PG_RETURN_NULL(); - /* per spec, return NULL for horizontal and vertical lines */ - if (!isnan(commonX) || !isnan(commonY)) + /* + * per spec, return NULL for horizontal and vertical lines; but not if the + * result would otherwise be NaN + */ + if ((!isnan(commonX) || !isnan(commonY)) && + (!isnan(Sxx) && !isnan(Syy))) PG_RETURN_NULL(); /* at this point, Sxx and Syy cannot be zero or negative */ (don't think it should be necessary to also check Sxy) BTW, HEAD is inconsistent: it will return NaN for this example, but only because it's confused by roundoff error into thinking that Y isn't constant. With few enough inputs, it produces NULL too: regression=# SELECT corr( 0.1 , 'nan' ) FROM generate_series(1,3) g; corr ------ (1 row) regards, tom lane