Re: Changing the state of data checksums in a running cluster
From: Daniel Gustafsson <daniel@yesql.se>
To: Tomas Vondra <tomas@vondra.me>
Cc: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>,
Ayush Tiwari <ayushtiwari.slg01@gmail.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Heikki Linnakangas <hlinnaka@iki.fi>,
Andres Freund <andres@anarazel.de>,
Bernd Helmle <mailings@oopsware.de>,
Michael Paquier <michael@paquier.xyz>,
Michael Banck <mbanck@gmx.net>
Date: 2026-05-29T20:08:11Z
Lists: pgsql-hackers

> On 28 May 2026, at 13:51, Tomas Vondra <tomas@vondra.me> wrote:
>
> On 5/28/26 13:28, Daniel Gustafsson wrote:
>>> On 26 May 2026, at 20:12, Tomas Vondra <tomas@vondra.me> wrote:
>>
>>> I suppose this means we should not be updating the checksum state
>>> without emitting the barrier? I think all other places do that.
>>
>> Good catch, it's indeed a bug, any state change must emit a procsignalbarrier
>> to maintain cluster consistency. I ended up writing a test for this very case
>> as well.
>
> Good.

I've pushed this now, along with your other findings, ahead of the beta1
deadline, buildfarm seems happy so far.

>>> I still don't understand why this needs DELAY_CHKPT_START ...
>>
>> Having stared at this for some time, and going over old threads, I think this
>> is a mistake. AFAICT though it cannot cause any error, so I'd lean towards
>> erring on the safe side by leaving as is and looking at removing in 20. What
>> do you think?
>>
>
> I'd probably try to fix this for 19, otherwise it may be confusing
> people looking at the code in the future. We're still months from 19
> getting released. Ofc, maybe I'm underestimating the risk.

You're probably right. Once beta1 is out I'll work on getting this fixed.

--
Daniel Gustafsson