Thread

  1. Re: Avoid calling SetMatViewPopulatedState if possible

    ZizhuanLiu X-MAN <44973863@qq.com> — 2026-05-28T04:31:34Z

    HI, ChangAo Chen, David Geier , hackers&nbsp;
    
    
    &gt;&gt; But refreshing a materialized view doesn't only change relispopulated
    &gt;&gt; but also columns like relfilenode, relpages, relhasindex, etc. Doesn't
    &gt;&gt; changing these columns during REFRESH MATERIALIZED VIEW make your
    &gt;&gt; optimization applicable in a lot less cases&nbsp;
    &gt;
    &gt;I don't think so. If we can skip SetMatViewPopulatedState(), we avoid generating
    &gt;a dead pg_class tuple in all cases.
    &gt;
    &gt;&gt; I'm actually wondering why it works at all, even in the example you
    &gt;&gt; gave. Because I thought that even when nothing has changed the pg_class
    &gt;&gt; row is updated for more columns than just relispopulated.
    &gt;
    &gt;"refresh materialized view concurrently" is done by doing DELETE + INSERT
    &gt;to the matview directly, so only relispopulated will change before the patch.
    &gt;After the patch, the pg_class row don't change anymore.
     
    
    
    1.Regarding pg_class.relispopulated
    This column was introduced in PostgreSQL 9.3 specifically for materialized views.&nbsp;
    Conceptually, the state of "IsScannable" is equivalent to and fully dependent on relispopulated.
    This flag(relispopulated) is also used across multiple features, including COPY/pg_repack/pg_dump.
    For the definitions of the populated and scannable states of materialized views, please refer to the official documentation:
    &nbsp; &nbsp; https://www.postgresql.org/docs/devel/sql-creatematerializedview.html
    &nbsp; &nbsp; https://www.postgresql.org/docs/devel/sql-refreshmaterializedview.html
    
    
    2.Current status
    As ChangAo Chen&nbsp;mentioned and verified via tests, and based on my own research and testing,
    I agree with this conclusion:
    &nbsp; &nbsp; If pg_class.relispopulated does not actually change, there is *no need* to invoke SetMatViewPopulatedState().
    Skipping this call will also avoid executing the subsequent call chain:
    &nbsp; &nbsp; CatalogTupleUpdate() -&gt; simple_heap_update() -&gt; heap_update().
    &nbsp; &nbsp; This helps prevent dead tuples from being generated, and also reduces overhead on underlying mechanisms
    &nbsp; &nbsp; including locks, transactions, WAL (xlog), as well as IPC contention and synchronization.&nbsp;
    
    
    3.Regarding updates to attributes in pg_class
    &gt; But refreshing a materialized view doesn't only change relispopulated
    &gt; but also columns like relfilenode, relpages, relhasindex, etc. Doesn't
    &gt; changing these columns during REFRESH MATERIALIZED VIEW make your
    &gt; optimization applicable in a lot less cases?
    To clarify: SetMatViewPopulatedState() is dedicated solely to modifying relispopulated.&nbsp;
    Columns such as relfilenode, relpages and relhasindex are updated by other separate routines.
    
    
    4. About SetMatViewPopulatedState() and its caller
    There are two possible ways to implement this optimization:
    &nbsp; &nbsp; --The first approach is to refactor SetMatViewPopulatedState() to handle the check internally:
    &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;If ((Form_pg_class) GETSTRUCT(tuple))-&gt;relispopulated != newstate, then perform the actual update; otherwise, do nothing.
    &nbsp; &nbsp; --The second approach is to perform the check inside the caller RefreshMatViewByOid(), as implemented in ChangAo Chen’s patch.
    &nbsp; &nbsp; &nbsp; &nbsp; I also noticed that intorel_startup() in createas.c follows the same design pattern — letting the caller handle the check before invoking the callee.&nbsp;
    
    
    5. A small suggestion
    After further consideration, I suggest using the condition:
    &nbsp; &nbsp; if (RelationIsPopulated(matviewRel) != !skipData)
    This is much more readable and intuitive compared to the original:
    &nbsp; &nbsp; if (RelationIsPopulated(matviewRel) == skipData)
    Especially when paired with the original call:
    &nbsp; &nbsp; SetMatViewPopulatedState(matviewRel, !skipData);
    
    
    Additionally, avoiding unnecessary updates and thus preventing dead tuples is especially valuable in BI/OLAP environments that use many materialized views and refresh them frequently (daily or hourly).
    
    
    6. About Testing
    --The new patch passes all PostgreSQL regression tests:
    &nbsp; xman@xman_lzz:~/patch_1/build/testrun/regress/regress$ grep -n matview regression.out
    128:# parallel group (20 tests): &nbsp;init_privs drop_operator security_label password lock object_address tablesample collate groupingsets replica_identity spgist gin matview identity gist generated_stored rowsecurity brin join_hash privileges
    137:ok 126 &nbsp; &nbsp; &nbsp; + matview &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 47396 ms
    xman@xman_lzz:~/patch_1/build/testrun/regress/regress$
    
    
    --I have also checked the output files and expected files for the materialized view regression tests, and they are identical:
    &nbsp; xman@xman_lzz:~$ ls -l /home/xman/patch_1/postgres/src/test/regress/expected/matview.out
    -rw-rw-r-- 1 xman xman 24946 May 25 22:29 /home/xman/patch_1/postgres/src/test/regress/expected/matview.out
    xman@xman_lzz:~$ ls -l /home/xman/patch_1/build/testrun/regress/regress/results/matview.out
    -rw-rw-r-- 1 xman xman 24946 May 27 23:05 /home/xman/patch_1/build/testrun/regress/regress/results/matview.out
    xman@xman_lzz:~$ diff /home/xman/patch_1/postgres/src/test/regress/expected/matview.out /home/xman/patch_1/build/testrun/regress/regress/results/matview.out
    xman@xman_lzz:~$ echo $?
    0
    xman@xman_lzz:~$&nbsp;
    
    
    
    
    The above is what I have found so far. Please feel free to correct me if I have any misunderstandings.
    Thanks,
    --
    Zizhuan Liu&nbsp;(X-MAN)&nbsp;
    44973863@QQ.COM