Thread
-
Re: [PATCH] Fix overflow and underflow in regr_r2()
Dean Rasheed <dean.a.rasheed@gmail.com> — 2026-05-28T12:37:54Z
On Sat, 23 May 2026 at 03:42, Chengpeng Yan <chengpeng_yan@outlook.com> wrote:
>
> Thanks for the regr_intercept.patch. The approach looks good to me.

Thanks for reviewing, and sorry for the delay getting back to you.

> 2. `dy` seems a bit hard to understand. Perhaps `offset`, as used in the
> earlier sketch, would be clearer?

[Shrug] I think dy is common enough to denote a difference in y-values,
and it seems clear enough, given the large comment above it.

> 3. Do we need to add tests for the underflow path, and perhaps for the
> Inf/NaN guard?

Yeah, I think it makes sense to include a test with underflow, since
that really can lead to a large relative error. I don't think it's
worth testing the Inf/NaN guard, since that's more about avoiding
operating on technically uninitialised variables, and I don't believe
that it actually affects the results.

I've added this test case:

    SELECT regr_intercept(y, x)
      FROM (VALUES (-1e-131, 0), (2e-131, 3e-131)) v(x, y);

Here, directly computing Sx * Sxy / Sxx causes an underflow to zero,
while the correct result should be 1e-131. Since Sy is 3e-131, this
makes a noticeable difference to the final result (without the patch,
it returns an intercept of 1.5e-131, whereas with the patch, it
correctly returns 1e-131).

If there are no objections from the RMT, I'll push both of these (to
HEAD only) in a couple of days or so.

Regards,
Dean
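
For readers who want to check the arithmetic in the test case above without a PostgreSQL build, here is a minimal Python sketch using ordinary double-precision floats. It is not the patch's code: the variable names are made up for illustration, the sums are computed directly from the means rather than by PostgreSQL's running-sum transition function, and the reordered expression is just one assumed way of avoiding the intermediate underflow, which may well differ from what the patch does.

```python
# Illustrative only: reproduces the arithmetic from the test case,
# not PostgreSQL's actual aggregate implementation.
xs = [-1e-131, 2e-131]
ys = [0.0, 3e-131]
n = len(xs)

sx = sum(xs)                      # about 1e-131
sy = sum(ys)                      # 3e-131
mean_x = sx / n
mean_y = sy / n

# Sums of squared deviations and cross-deviations (both about 4.5e-262,
# still comfortably inside the normal double range).
sxx = sum((x - mean_x) * (x - mean_x) for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Evaluated left to right, Sx * Sxy is about 4.5e-393, far below the
# smallest subnormal double (about 4.9e-324), so it underflows to zero.
naive = (sy - sx * sxy / sxx) / n

# Assumed alternative ordering: Sxy / Sxx is about 1, so nothing underflows.
reordered = (sy - sx * (sxy / sxx)) / n

print("naive:    ", naive)        # close to 1.5e-131
print("reordered:", reordered)    # close to 1e-131
```

The two points lie on the line y = x + 1e-131, so the second figure is the expected intercept, and the first shows the 50% relative error that the underflow introduces.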