RE: logical decoding and replication of sequences, take 2

Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>

From: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>

To: 'Tomas Vondra' <tomas.vondra@enterprisedb.com>, Amit Kapila <amit.kapila16@gmail.com>

Cc: "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Masahiko Sawada <sawada.mshk@gmail.com>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>, Dilip Kumar <dilipbalaut@gmail.com>

Date: 2023-12-01T11:08:16Z

Lists: pgsql-hackers

Commits

Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited

Attachments

perf_results.txt (text/plain)

Dear Tomas,

> I did some micro-benchmarking today, trying to identify cases where this
> would cause unexpected problems, either due to having to maintain all
> the relfilenodes, or due to having to do hash lookups for every sequence
> change. But I think it's fine, mostly ...
>

I also ran performance tests (especially case 3). First of all, my setup differs
from yours in a few ways.

1. Patch 0002 was reverted because it has an issue. So this test checks whether
   the refactoring around ReorderBufferSequenceIsTransactional() is really needed.
2. Per comments from Amit, I also measured the abort case. In this case,
   alter_sequence() is called but the transaction is aborted.
3. I varied the number of clients over {8, 16, 32, 64, 128}. In all cases, the
   clients executed 1000 transactions. The test machine has 128 cores, so the
   result for 128 clients might be saturated.
4. A short sleep (0.1s) was added in alter_sequence(), specifically between
   "alter sequence" and nextval(), because while testing I found that the
   transactions were too short to actually run in parallel. I think this is
   reasonable because ReorderBufferSequenceIsTransactional() might get worse as
   the parallelism increases.

I attached perf to one backend process and executed pg_logical_slot_get_changes().
The attached txt file shows which functions consumed CPU time, especially under
pg_logical_slot_get_changes_guts() and ReorderBufferSequenceIsTransactional().
Here are my observations about them.

* In the commit case, as you said, SnapBuildCommitTxn() seems dominant for the
  8-64 client cases.
* For the (commit, 128 clients) case, however, ReorderBufferRestoreChanges()
  consumes a lot of time. I think this is because the changes exceed
  logical_decoding_work_mem, so we do not have to analyze this case further.
* In the abort case, the CPU time used by ReorderBufferSequenceIsTransactional()
  grows roughly linearly with the number of clients. This means that we need to
  find some way to avoid the overhead of
  ReorderBufferSequenceIsTransactional().

```
clients  occupied time
8        3.73%
16       7.26%
32       15.82%
64       29.14%
128      46.27%
```
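
As a quick sanity check of the "roughly linear" reading, here is a minimal Python sketch that computes how much the share grows each time the client count doubles. The percentages are copied from the table above. The helper name doubling_ratios is only for illustration and is not part of the patch or of perf. Under linear growth each ratio would be close to 2.

```python
# Share of CPU time attributed to ReorderBufferSequenceIsTransactional()
# in the abort case, copied from the table above (clients -> percent).
occupied = {8: 3.73, 16: 7.26, 32: 15.82, 64: 29.14, 128: 46.27}


def doubling_ratios(samples):
    """Return (clients, ratio) pairs: growth of the share per step in client count.

    Hypothetical helper for illustration only.
    """
    clients = sorted(samples)
    return [(b, samples[b] / samples[a]) for a, b in zip(clients, clients[1:])]


ratios = doubling_ratios(occupied)
for clients, ratio in ratios:
    # A ratio near 2.0 means the share doubled when the clients doubled.
    print(f"{clients:>3} clients: x{ratio:.2f} compared with the previous step")
```

The ratios stay close to 2 up to 64 clients and drop for the last step. That drop might be related to the saturation mentioned for the 128-core machine, but the numbers alone cannot confirm it.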

* In the abort case, I also checked the CPU time used by
  ReorderBufferAddRelFileLocator(), but it does not seem to depend much on the
  number of clients.

```
8 clients 3.66% occupied time
16 6.94%
32 4.65%
64 5.39%
128 3.06%
```
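
The same kind of check can be sketched for ReorderBufferAddRelFileLocator(), again as a small Python example using only the percentages from the table above. The variable names are illustrative. It compares how much the client count grew with how much the share changed between the smallest and largest runs.

```python
from statistics import mean

# Share of CPU time attributed to ReorderBufferAddRelFileLocator()
# in the abort case, copied from the table above (clients -> percent).
locator = {8: 3.66, 16: 6.94, 32: 4.65, 64: 5.39, 128: 3.06}

clients = sorted(locator)
first, last = clients[0], clients[-1]

client_growth = last / first                    # how much the client count grew
share_growth = locator[last] / locator[first]   # how much the share changed
average_share = mean(locator.values())

print(f"clients grew x{client_growth:.0f}, share changed x{share_growth:.2f}")
print(f"average share: {average_share:.2f}%, "
      f"range: {min(locator.values()):.2f}%-{max(locator.values()):.2f}%")
```

The client count grows 16 times while the share stays within a few percent and does not rise from the first run to the last. That fits the observation that this function does not scale with the number of clients, though with five samples it is only a rough indication.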

As a next step, I plan to run a case that uses the setval() function, because it
generates more WAL than a normal nextval().
What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED