Thread

  1. Re: PoC: pg_dump --filter-data <file> (like Oracle Where Clause on RMAN for specific tables)

    Nikolay Samokhvalov <nik@postgres.ai> — 2025-08-26T15:04:00Z

    On Sat, Aug 23, 2025 at 8:05 AM jian he <jian.universality@gmail.com> wrote:
    
    > hi.
    >
    > I just found this
    >
    > https://www.postgresql.org/message-id/flat/CALAY4q8o00Sg5nemi2AuqNuLvmGLaR6br%2BYrvM41ZuSPjM2Qvg%40mail.gmail.com
    > maybe it's not that very helpful.
    >
    > IMV,
    > pg_dump --option="view_name"
    > is better than
    > pg_dump -d cary --where="test1:a3 = ( select max(aa1) from test2 )" >
    > testdump2
    >
    
    In some cases, we cannot or don't want to create a view. For example, view
    creation might be banned in a project :) to avoid dependency management
    headache. Or we simply might lack permissions for DDL.
    
    Back to the idea of having ability to dump with arbitrary filter – I think
    it's a great idea. COPY supports arbitrary SELECT as input and STDOUT as
    output and this is what we usually use, via psql, but doing so, we lose a
    lot of important mechanics pg_dump has.
    
    At the same time, if we think about approach with views, it brings much
    more than just filtering out some rows -- we can do arbitrary
    transformations including projection, aggregation, and joining.
    
    Ideally, it would be awesome to have a concept of virtual view that would
    be implemented at pg_dump level to support any kind transformation. While
    avoiding the need to have DDL permissions and change schema. This could
    give huge freedom and enable lots of workflows (e.g., for testing –
    replacing some actual sensitive values with random data on the fly would be
    extremely helpful to have!)
    
    Nik