Thread

  1. Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW

    Adam Brusselback <adambrusselback@gmail.com> — 2026-05-26T18:53:28Z

    Hi Vellaipandiyan, thanks for the review.
    
    > I wonder whether overlapping refreshes could still encounter deadlock
    > scenarios around UPSERT conflicts.
    
    That was definitely a gap. The FOR UPDATE step previously issued:
    
        SELECT 1 FROM mv WHERE (...) FOR UPDATE
    
    with no ORDER BY, so two overlapping refreshes could lock the existing
    rows in different physical orders and deadlock before either one reached
    the next statement.
    
    To fix this, the next patch gives the locking SELECT a deterministic
    ORDER BY on the unique key columns so every refresh acquires row locks
    in the same sequence. The upsert now feeds from an ordered source as
    well.
    
    > The CONCURRENTLY behavior also feels somewhat unintuitive here.
    
    I'll include this change in the next patch. After more thinking about
    it (and some feedback), I decided that going this way seems better. So any
    prior discussion about CONCURRENTLY vs without is now talking about the
    inverse of the current version of the patch.
    
    Additionally, I've fixed a "scope drift" issue where the two different
    implementations
    had inconsistent behavior. A row's non-key column changes in the base
    table, shifting it
    into the WHERE predicate's scope. Previously, this caused a unique
    constraint
    violation for the match_merge path because the old row was invisible to the
    filter
    and never deleted. This was previously handled by the direct_mod path
    properly.
    
    Now, both the direct-modification and match/merge paths resolve this by
    using an INSERT ... ON CONFLICT DO UPDATE step (or DO NOTHING if there
    are no non-key columns) against the arbiter index. This safely updates
    the out-of-scope stale row in place without duplicating the key.
    
    > It may also help to document the intended guarantees around overlapping
    > partial refreshes and concurrent DML on base tables.
    
    Here is my attempt at saying what the implementation actually guarantees;
    I will fit this into the docs in some way. If anyone notices something
    wrong with the below, please speak up.
    
    Concurrent Partial Refresh (direct_mod path):
    The FOR UPDATE / upsert-CTE path takes RowExclusiveLock. Readers are
    never blocked. Neither RowExclusiveLock nor the per-row FOR UPDATE
    locks conflict with a plain SELECT. The refresh issues two statements
    in one transaction: the locking SELECT, then the select/upsert/
    anti-join-delete CTE. Under the default READ COMMITTED, these take
    separate snapshots, but the FOR UPDATE row locks are held to transaction
    end, so they bridge both statements. An overlapping refresh blocks on
    those locks for the full duration of the refreshing transaction, not
    just for the locking SELECT, so the gap between the two statements does
    not expose existing MV rows. The CTE's upsert and delete are a single
    statement, so a reader sees either the pre-refresh or post-refresh
    state of an affected row, never a half-applied one.
    
    Concurrent partial refreshes whose predicates touch overlapping existing
    MV rows are serialized. The second waits on the first's FOR UPDATE
    locks. Refreshes over disjoint row sets run in parallel. Within the
    predicate scope, the MV is made consistent with the query snapshot: rows
    present in the snapshot are upserted, rows that no longer appear are
    deleted via the anti-join.
    
    Rows whose key falls outside the predicate are not touched. Note this
    is keyed on the unique index, not on the predicate. As mentioned with
    the drift fix, if the predicate matches a fresh row whose key collides
    with an existing MV row that does not currently match the predicate,
    ON CONFLICT will update that existing row, and the step-one FOR UPDATE
    will not have locked it. This only arises when the predicate references
    non-key columns. When the predicate ranges over the unique-key columns,
    the colliding row necessarily matches too.
    
    It does not lock the base tables. Base-table DML committed after the
    refresh's snapshot is not reflected, identical to a normal full REFRESH.
    
    SELECT FOR UPDATE only serializes overlapping refreshes covering rows
    that already exist in the MV. Two refreshes that both insert the same
    new logical key are serialized by ON CONFLICT and the unique index, not
    by FOR UPDATE. The outcome is still correct. The last writer wins on
    that key.
    
    The predicate must be non-volatile (enforced) and a usable unique index
    is required (enforced).
    
    The above assumes READ COMMITTED. I haven't thought through how things
    will work with other isolation levels.
    
    Non-Concurrent Partial Refresh (match_merge path):
    The WHERE diff/merge path takes an ExclusiveLock and operates similarly
    to the standard full-refresh diff/merge, but with the diff scope
    restricted to rows matching the predicate. Readers are allowed, writers
    are blocked, and overlapping refreshes are serialized at the table
    level. The main difference from a full concurrent refresh is that its
    final insert uses ON CONFLICT DO UPDATE (just like the direct_mod path)
    specifically to resolve unique key violations caused by rows drifting
    into the predicate's scope.
    
    I will provide a new patch shortly.
    
    Thanks,
    Adam Brusselback