Thread

  1. Add --extra-dependencies and immediate data dumping for pg_dump/pg_upgrade

    Jeevan Chalke <jeevan.chalke@enterprisedb.com> — 2025-12-24T11:44:18Z

    Hello Hackers,
    
    We have identified a dependency issue—most notably observed with the
    PostGIS extension—where a table's column definition relies on data existing
    in another table's catalog at restore time. Because pg_dump typically
    separates schema and data into distinct sections, these implicit data-level
    dependencies are not captured, leading to failures during pg_upgrade or
    pg_restore.
    
    Jakub Wartak previously reported a detailed example of this issue here:
    https://www.postgresql.org/message-id/CAKZiRmwWyh-yGM8Hrvuuo04JiYFy8S4TLM-3Mn-zi9Rfqc744Q%40mail.gmail.com
    
    Following a discussion with Alvaro Herrera, we have developed a patch based
    on his suggestions.
    The Problem
    
    In certain extension-heavy schemas, an object's schema definition cannot be
    created unless another table's data is already populated. Current pg_dump
    logic handles schema-to-schema dependencies via pg_depend, but it lacks a
    mechanism to:
    
       1.
    
       Enforce a specific order for dependencies not recorded in pg_depend.
       2.
    
       Interleave data loading with schema creation for specific tables.
    
    Proposed Solution
    
    This patch introduces a new option, --extra-dependencies, for pg_dump and
    pg_upgrade.
    
    *1. Dependency Hinting:* The option allows users to provide manual
    dependency hints in the format table#referenced_table. Internally, pg_dump
    treats these as "faked" dependencies, ensuring the referenced table is
    sorted before the dependent object in the dump graph, similar to standard
    pg_depend entries.
    
    *2. Immediate Data Dumping:* To satisfy the data-level requirement, the
    patch ensures that any table referenced via this option has its data dumped
    immediately following its definition (utilizing the dumpTableData()
    infrastructure) rather than in the general DATA section of the dump.
    Use Case
    
    While this is a specialized tool intended for unrecorded dependencies (like
    those in PostGIS), it provides a necessary safety valve for migrations that
    currently require manual intervention or complex workarounds.
    
    Attached are the patches (PoC) for review. We look forward to your thoughts
    and suggestions.
    
    Regards,
    
    -- 
    *Jeevan Chalke*
    *Principal Engineer, Engineering Manager*
    *Product Development*
    
    enterprisedb.com <https://www.enterprisedb.com>