Thread

  1. [PATCH] Improving index selection for logical replication apply with replica identity full

    Ethan Mertz <ethan.mertz@gmail.com> — 2026-05-22T17:18:12Z

    Hello hackers,
    
    
    I'd like to reopen the discussion on index selection for logical
    replication apply for replica identity full. Since PostgreSQL 14, replica
    identity full is able to make use of existing indexes [1][2] (authors in
    CC) when replicating UPDATE or DELETE operations.
    
    Today, when identifying which index to use for the update or delete, the
    first suitable index is chosen by OID order, which generally corresponds to
    creation order. If the chosen index has low cardinality, the lookup may
    perform no better than a sequential scan. While avoiding replica identity
    full is generally recommended, some users need to maintain REPLICA IDENTITY
    FULL to support downstream logical consumers that require full row images.
    These users would also like performant PostgreSQL to PostgreSQL replication.
    
    I propose improving the index selection heuristic to prefer unique indexes,
    favoring those with fewer columns. Previous discussion in the linked
    threads avoided invoking the planner for full index selection; the
    heuristic I propose serves as a middle ground. A unique index guarantees
    that each tuple match requires at most one index scan, and among unique
    indexes, fewer columns means a narrower, more efficient lookup. I have
    attached a patch implementing this check.
    
    
    In addition, I've performed some simple performance testing of this patch:
    
    ```
    CREATE TABLE cardinality_index_test (
        id          uuid NOT NULL DEFAULT gen_random_uuid(),
        is_active   boolean NOT NULL DEFAULT true,
        payload     text NOT NULL DEFAULT repeat('x', 200)
    );
    
    CREATE INDEX idx_bad_low_cardinality ON cardinality_index_test (is_active);
    
    CREATE UNIQUE INDEX idx_good_unique ON cardinality_index_test (id);
    ```
    
    With the patch, the time to replicate 1,000 point updates on a table size
    of 1,000,000 dropped from 53 seconds to under 1 second.
    
    Feedback welcome.
    
    Thanks,
    
    Ethan Mertz
    SDE, Amazon Web Services
    
    [1]
    https://www.postgresql.org/message-id/flat/CACawEhVLqmAAyPXdHEPv1ssU2c%3DdqOniiGz7G73HfyS7%2BnGV4w%40mail.gmail.com
    
    [2]
    https://www.postgresql.org/message-id/flat/TYAPR01MB58669D7414E59664E17A5827F522A%40TYAPR01MB5866.jpnprd01.prod.outlook.com