Thread

Re: [PATCH] Fix overflow and underflow in regr_r2()

Dean Rasheed <dean.a.rasheed@gmail.com> — 2026-05-16T20:34:25Z
On Sat, 16 May 2026 at 19:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> > On Sat, 16 May 2026 at 17:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> One simple change that might make things better is to compute
> >>      PG_RETURN_FLOAT8((Sy - Sx * (Sxy / Sxx)) / N);
> >> on the theory that the sums of products are likely to both be large.
>
> > Hmm, that isn't necessarily better.
>
> True.  We could do something like
>
>         (Sy - exp(log(Sx) + log(Sxy) - log(Sxx))) / N
>
> but this'd require additional special-case code to deal with zero
> or negative Sx and Sxy, so it's feeling rather tedious.

I just had a thought: a simpler (and probably faster and more
accurate) solution would be to use frexp() and ldexp(), which are both
part of C99, so ought to be OK.

I don't think we can ignore underflow, because Sy might be exactly
zero, in which case the second term gives the whole result.

So, in rough terms, I'm thinking of something like this:

  offset = Sx * Sxy / Sxx
  if (offset == 0 || isinf(offset))
  {
    double m_Sx, m_Sxy, m_Sxx, m_offset;
    int n_Sx, n_Sxy, n_Sxx, n_offset;

    m_Sx = frexp(Sx, &n_Sx);
    m_Sxy = frexp(Sxy, &n_Sxy);
    m_Sxx = frexp(Sxx, &n_Sxx);

    m_offset = m_Sx * m_Sxy / m_Sxx;
    n_offset = n_Sx + n_Sxy - n_Sxx;
    offset = ldexp(m_offset, n_offset);
  }

  PG_RETURN_FLOAT8((Sy - offset) / N);

Regards,
Dean