Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
shveta malik <shveta.malik@gmail.com>
From: shveta malik <shveta.malik@gmail.com>
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>, "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>, "pgsql-hackers@lists.postgresql.org" <pgsql-hackers@lists.postgresql.org>,
shveta malik <shveta.malik@gmail.com>
Date: 2024-08-14T04:55:56Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Don't advance origin during apply failure.
- 1528b0d899d6 19 (unreleased) landed
- 2f7ffe124a9b 18.2 landed
- 63a65adf4d8e 16.12 landed
- 0ed8f1afb15f 17.8 landed
- 3f28b2fcac33 18.0 landed
- 915aafe82a7c 17.0 landed
- b39c5272c1d2 16.5 landed
-
Perform apply of large transactions by parallel workers.
- 216a784829c2 16.0 cited
Attachments
- v4-0001-Prevent-origin-progress-advancement-if-failed-to-.patch (application/octet-stream) patch v4-0001
On Tue, Aug 13, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 12, 2024 at 3:37 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Aug 9, 2024 at 2:39 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > >
> > > > + /*
> > > > + * Register a callback to reset the origin state before aborting the
> > > > + * transaction in ShutdownPostgres(). This is to prevent the advancement
> > > > + * of origin progress if the transaction failed to apply.
> > > > + */
> > > > + before_shmem_exit(replorigin_reset, (Datum) 0);
> > > >
> > > > I think we need this despite resetting the origin-related variables in
> > > > PG_CATCH block to handle FATAL error cases, right? If so, can we use
> > > > PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()?
> > >
> > > There are two reasons to add a shmem-exit callback. One is to support a FATAL,
> > > another one is to support the case that user does the shutdown request while
> > > applying changes. In this case, I think ShutdownPostgres() can be called so that
> > > the session origin may advance.
> >
> > Agree that we need the 'reset' during shutdown flow as well. Details at [1]
> >
>
> Thanks for the detailed analysis. I agree with your analysis that we
> need to reset the origin information for the shutdown path to avoid it
> being advanced incorrectly. However, the patch doesn't have sufficient
> comments to explain why we need to reset it for both the ERROR and
> Shutdown paths. Can we improve the comments in the patch?
>
> Also, for the ERROR path, can we reset the origin information in
> apply_error_callback()?
Please find v4 attached. Addressed comments in that.
Manual testing done on v4:
1) Error and Fatal case
2) Shutdown after replorigin_session_origin_lsn was set in
apply_handle_prepare() and before EndPrepare was called.
2a) with log_min_messages=debug5. This will result in processing
of shutdown signal by errfinish() before PREPARE is over.
2b) with default log_min_messages. This will result in processing
of shutdown signal by LogicalRepApplyLoop() after ongoing PREPARE is
over.
>
> BTW, this needs to be backpatched till 16 when it has been introduced
> by the parallel apply feature as part of commit 216a7848. So, can we
> test this patch in back-branches as well?
>
Sure, will do next.
thanks
Shveta