Re:RE: Re:BUG #18369: logical decoding core on AssertTXNLsnOrder()

ocean_li_996 <ocean_li_996@163.com>

From: ocean_li_996 <ocean_li_996@163.com>
To: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>
Cc: "'Alexander Lakhin'" <exclusion@gmail.com>, "pgsql-bugs@lists.postgresql.org" <pgsql-bugs@lists.postgresql.org>, feichanghong@qq.com
Date: 2024-03-06T03:33:45Z
Lists: pgsql-bugs
Dear Hayato Kuroda,


Thanks for your attention.


At 2024-03-05 17:24:05, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote:
>Dear Haiyang Li,
>
>...
>## Found issue
>
>### Empty transaction is decoded on PG14 and PG15
>
>However, there is a room for generating ReorderBufferTxn for empty transactions,
>which was introduced by 6b77048e5. Conditions are:
>
>1. There are sub transactions which modify only temp tables, and
>2. the top transaction modifies the catalog.
>
>The call-stack toward the generation is below.
>
>```
>ReorderBufferTXNByXid(create = true, create_as_top = true)
>ReorderBufferXidSetCatalogChanges()    // for sub transactions
>SnapBuildXidSetCatalogChanges()          // for top transaction
>DecodeCommit()                                      // for top transaction
>```
>
>The path has been introduced by 6b77048e5.
>Previously, calling ReorderBufferXidSetCatalogChanges() for sub transactions
>would be skipped, if they do not have catalog changes or they have not decoded yet.
>However, the commit ensures sub transactions must be marked as containing
>catalog changes, and this also enforces to decode transactions even if it is
>empty.
>
>### Assertion failure
>
>The empty transactions would be created as top transactions. At that time,
>AssertTXNLsnOrder() is called so that we ensured that first_lsn of top-transactions
>must be strictly higher than previous. But they can be the same if there are more
>than two empty transactions. It led an assertion failure.

>
Your analysis is correct for me.  Actually, I mentioned in [1] that I can reproduce this issue before 6b77048e5. 
After some attempts and analysis, I also believe that the issue will only occur after 6b77048e5.


> ...
>
>## Possible solutions
>
>I think there are several solutions.
>Note that I assumed here that fixes for all the versions should be almost the same.
>
>* Ease the condition in AssertTXNLsnOrder(). If the decoded transaction is empty,
>  it can be allowed that the first_lsn is same as previous one.

>  PSA file to see my consideration.
LGFM.  For my observation, the most case failed on AsserTXNOrder is checking empty
decoded transaction. Maybe we should pay more attention to review ReorderBufferTXNIsEmpty.


>* Generate a ReorderBufferTXN as sub transaction when we are in this path.
>  The approach has already been shared by you. However, note that this needs to
>  extend the ReorderBufferXidSetCatalogChanges function, and breaks ABI

>  compatibility [1].
Yes, It breaks ABI compatibility.  


>* Avoid calling ReorderBufferXidSetCatalogChanges() if the target transaction
>  has not been decoded. An concern is that ReorderBuffer does not provide an API
>  for checking whether the transaction has been already decoded or not.

>
I think this idear is  a little complex, especially when considering version compatibility.


[1] https://www.postgresql.org/message-id/6444e39.131bc.18df5c0cae3.Coremail.ocean_li_996@163.com
Best Regards,
Haiyang Li