Re: logical decoding and replication of sequences, take 2

From: Tomas Vondra <tomas.vondra@enterprisedb.com>

To: Amit Kapila <amit.kapila16@gmail.com>

Cc: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>, "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Masahiko Sawada <sawada.mshk@gmail.com>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>, Dilip Kumar <dilipbalaut@gmail.com>

Date: 2023-12-05T16:53:37Z

Lists: pgsql-hackers

Commits

Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited

Attachments

alter-sequence-master.perf.gz (application/gzip)
alter-sequence-optimized.perf.gz (application/gzip)
diff-alter-sequence.perf.gz (application/gzip)
diff-nextval.perf.gz (application/gzip)
diff-nextval-40.perf.gz (application/gzip)
nextval-40-master.perf.gz (application/gzip)
nextval-40-optimized.perf.gz (application/gzip)
nextval-master.perf.gz (application/gzip)
nextval-optimized.perf.gz (application/gzip)

On 12/5/23 13:17, Amit Kapila wrote:
> ...
>> I was hopeful the global hash table would be an improvement, but that
>> doesn't seem to be the case. I haven't done much profiling yet, but I'd
>> guess most of the overhead is due to ReorderBufferQueueSequence()
>> starting and aborting a transaction in the non-transactional case. Which
>> is unfortunate, but I don't know if there's a way to optimize that.
>>
> 
> Before discussing the alternative ideas you shared, let me try to
> clarify my understanding so that we are on the same page. I see two
> observations based on the testing and discussion we had (a) for
> non-transactional cases, the overhead observed is mainly due to
> starting/aborting a transaction for each change;

Yes, I believe that's true. See the attached profiles for nextval.sql
and nextval-40.sql from master and the optimized build (with the global
hash), and also a perf-diff. I only include the top 1000 lines of each
profile, which should be enough.

master - current master without patches applied
optimized - master + sequence decoding with global hash table

For nextval, there's almost no difference in the profile. Decoding the
other changes (inserts) is the dominant part, as we only log sequences
every 32 increments.
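
A back-of-the-envelope model of that ratio, as a minimal Python sketch. It
assumes one sequence WAL record per batch of 32 nextval() calls (the figure
quoted above) and ignores caching and checkpoints. The function name and the
workload of 1000 rows are made up for illustration.

```python
import math

SEQ_LOG_EVERY = 32  # assumption: one sequence WAL record per 32 increments


def sequence_wal_records(nextval_calls: int) -> int:
    """Approximate number of sequence WAL records for a run of nextval() calls."""
    return math.ceil(nextval_calls / SEQ_LOG_EVERY)


rows = 1000  # hypothetical workload: one nextval() and one insert per row
seq_records = sequence_wal_records(rows)
print(f"{rows} inserts vs. {seq_records} sequence records to decode")
```

Under these assumptions the inserts outnumber the sequence records by roughly
30 to 1, which is consistent with the profile barely changing.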

For nextval-40, the main increase is likely due to this part:

  |--11.09%--seq_decode
  |     |
  |     |--9.25%--ReorderBufferQueueSequence
  |     |     |
  |     |     |--3.56%--AbortCurrentTransaction
  |     |     |    |
  |     |     |     --3.53%--AbortSubTransaction
  |     |     |        |
  |     |     |        |--0.95%--AtSubAbort_Portals
  |     |     |        |          |
  |     |     |        |           --0.83%--hash_seq_search
  |     |     |        |
  |     |     |         --0.83%--ResourceOwnerReleaseInternal
  |     |     |
  |     |     |--2.06%--BeginInternalSubTransaction
  |     |     |          |
  |     |     |           --1.10%--CommitTransactionCommand
  |     |     |                     |
  |     |     |                      --1.07%--StartSubTransaction
  |     |     |
  |     |     |--1.28%--CleanupSubTransaction
  |     |     |          |
  |     |     |           --0.64%--AtSubCleanup_Portals
  |     |     |                     |
  |     |     |                      --0.55%--hash_seq_search
  |     |     |
  |     |      --0.67%--RelidByRelfilenumber

So yeah, that's the transaction stuff in ReorderBufferQueueSequence.
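
Adding up the subtransaction-related children in the call graph above gives a
rough idea of how much of ReorderBufferQueueSequence that is. A small Python
sketch using only the percentages printed in the profile (the dictionary
layout and variable names are mine, not from any tool):

```python
# Direct children of ReorderBufferQueueSequence in the nextval-40 profile,
# with the percentages exactly as printed in the call graph.
children = {
    "AbortCurrentTransaction": 3.56,
    "BeginInternalSubTransaction": 2.06,
    "CleanupSubTransaction": 1.28,
    "RelidByRelfilenumber": 0.67,
}

# The entries that start, abort, or clean up the subtransaction.
TXN_FUNCS = (
    "AbortCurrentTransaction",
    "BeginInternalSubTransaction",
    "CleanupSubTransaction",
)

parent = 9.25  # ReorderBufferQueueSequence itself

txn_total = sum(children[name] for name in TXN_FUNCS)
share = 100.0 * txn_total / parent

print(f"transaction handling: {txn_total:.2f} of {parent:.2f} points ({share:.1f}%)")
```

So about 6.9 of the 9.25 points, or roughly three quarters of what is
attributed to the function, go to starting, aborting and cleaning up the
subtransaction.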

There's also a perf-diff, comparing individual functions.

> (b) for transactional
> cases, we see overhead due to traversing all the top-level txns and
> check the hash table for each one to find whether change is
> transactional.
> 

Not really, no. As I explained in my preceding e-mail, this check makes
almost no difference - I did expect it to matter, but it doesn't. And I
was a bit disappointed the global hash table didn't move the needle.

Most of the time is spent in

    78.81%     0.00%  postgres  postgres  [.] DecodeCommit (inlined)
      |
      ---DecodeCommit (inlined)
         |
         |--72.65%--SnapBuildCommitTxn
         |     |
         |      --72.61%--SnapBuildBuildSnapshot
         |            |
         |             --72.09%--pg_qsort
         |                    |
         |                    |--66.24%--pg_qsort
         |                    |          |

And there's almost no difference between master and the build with
sequence decoding - see the attached diff-alter-sequence.perf, comparing
the two branches (perf diff -c delta-abs).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company