Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

Amit Kapila <amit.kapila16@gmail.com>

From: Amit Kapila <amit.kapila16@gmail.com>
To: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>
Cc: "pgsql-hackers@lists.postgresql.org" <pgsql-hackers@lists.postgresql.org>, "shveta.malik@gmail.com" <shveta.malik@gmail.com>
Date: 2024-08-08T06:33:30Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Don't advance origin during apply failure.

  2. Perform apply of large transactions by parallel workers.

On Thu, Aug 8, 2024 at 10:37 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
...
>
> An easiest fix is to reset session replication origin before calling the
> RecordTransactionAbort(). I think this can happen when 1) LogicalRepApplyLoop()
> raises an ERROR or 2) apply worker exits. Attached patch can fix the issue on HEAD.
>

Few comments:
=============
*
@@ -4409,6 +4409,14 @@ start_apply(XLogRecPtr origin_startpos)
  }
  PG_CATCH();
  {
+ /*
+ * Reset the origin data to prevent the advancement of origin progress
+ * if the transaction failed to apply.
+ */
+ replorigin_session_origin = InvalidRepOriginId;
+ replorigin_session_origin_lsn = InvalidXLogRecPtr;
+ replorigin_session_origin_timestamp = 0;

Can't we call replorigin_reset() instead here?

*
+ /*
+ * Register a callback to reset the origin state before aborting the
+ * transaction in ShutdownPostgres(). This is to prevent the advancement
+ * of origin progress if the transaction failed to apply.
+ */
+ before_shmem_exit(replorigin_reset, (Datum) 0);

I think we need this despite resetting the origin-related variables in
PG_CATCH block to handle FATAL error cases, right? If so, can we use
PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()?


-- 
With Regards,
Amit Kapila.