Re: [PATCH] Improving index selection for logical replication apply with replica identity full

From: Ethan Mertz <ethan.mertz@gmail.com>
To: Masahiko Sawada <sawada.mshk@gmail.com>
Cc: pgsql-hackers@postgresql.org, "kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>, "onderkalaci@gmail.com" <onderkalaci@gmail.com>
Date: 2026-05-29T13:42:58Z
Lists: pgsql-hackers

> I think it's true that for unique indexes, fewer keys lead to better
> performance. But the same is not necessarily true for non-unique
> indexes: more keys could narrow the search space.

Agreed. I have amended the patch to keep the existing behavior for
non-unique indexes.

> I think it would be
> better to address the cases where there are no unique indexes on the
> subscriber. Even if we don't have dedicated handling for non-unique
> indexes, we can at least leave some comments.

I have amended the patch to include a comment explaining the behavior for
non-unique indexes.
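
For readers following along, the selection order discussed in this thread
can be sketched roughly as below. This is a standalone illustration, not the
patch itself: the IndexCandidate struct and pick_apply_index() function are
hypothetical names, not PostgreSQL symbols, and the candidate array is
assumed to be sorted by OID already.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a candidate index; not a PostgreSQL struct. */
typedef struct IndexCandidate
{
    unsigned int oid;       /* index OID; array is assumed sorted by OID */
    bool         is_unique; /* true for a unique index */
    int          nkeys;     /* number of key columns */
} IndexCandidate;

/*
 * Return the array position of the preferred candidate, or -1 if there are
 * no candidates.
 *
 * Unique indexes are preferred over non-unique ones, and among unique
 * indexes the one with the fewest key columns wins (ties keep the earlier,
 * lower-OID entry).  If no unique index exists, the first candidate is
 * returned, i.e. the existing first-suitable-index-by-OID behavior, since
 * fewer keys are not necessarily better for a non-unique index.
 */
static int
pick_apply_index(const IndexCandidate *cands, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
    {
        if (best < 0)
        {
            best = (int) i;     /* first suitable index is the fallback */
            continue;
        }
        if (!cands[i].is_unique)
            continue;           /* non-unique never displaces a choice */
        if (!cands[best].is_unique || cands[i].nkeys < cands[best].nkeys)
            best = (int) i;
    }
    return best;
}
```

With candidates (non-unique, 3 keys), (unique, 2 keys), (unique, 1 key) in
OID order, this sketch picks the third; with only non-unique candidates it
picks the first.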

Without invoking the planner, I think it would be difficult to reason about
the performance of a scan on a non-unique index. A beneficial future
optimization might let users selectively invoke the planner for their
logical apply processes, possibly through a new subscription option.

Attached is the updated patch.

Thank you,

Ethan Mertz
SDE, Amazon Web Services

On Thu, May 28, 2026 at 6:34 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

> Hi,
>
> On Fri, May 22, 2026 at 10:18 AM Ethan Mertz <ethan.mertz@gmail.com>
> wrote:
> >
> > Hello hackers,
> >
> > I'd like to reopen the discussion on index selection for logical
> > replication apply for replica identity full. Since PostgreSQL 16, replica
> > identity full is able to make use of existing indexes [1][2] (authors in
> > CC) when replicating UPDATE or DELETE operations.
> >
> > Today, when identifying which index to use for the update or delete, the
> > first suitable index is chosen by OID order, which generally corresponds to
> > creation order. If the chosen index has low cardinality, the lookup may
> > perform no better than a sequential scan. While avoiding replica identity
> > full is generally recommended, some users need to maintain REPLICA IDENTITY
> > FULL to support downstream logical consumers that require full row images.
> > These users would also like performant PostgreSQL to PostgreSQL replication.
> >
> > I propose improving the index selection heuristic to prefer unique
> > indexes, favoring those with fewer columns. Previous discussion in the
> > linked threads avoided invoking the planner for full index selection; the
> > heuristic I propose serves as a middle ground. A unique index guarantees
> > that each tuple match requires at most one index scan, and among unique
> > indexes, fewer columns means a narrower, more efficient lookup. I have
> > attached a patch implementing this check.
>
> +1
>
> I think it's true that for unique indexes, fewer keys lead to better
> performance. But the same is not necessarily true for non-unique
> indexes: more keys could narrow the search space. I think it would be
> better to address the cases where there are no unique indexes on the
> subscriber. Even if we don't have dedicated handling for non-unique
> indexes, we can at least leave some comments.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
>