Thread

  1. Re: postgres_fdw: Use COPY to speed up batch inserts

    Matheus Alcantara <matheusssilv97@gmail.com> — 2025-11-18T22:13:17Z

    On Mon Nov 17, 2025 at 11:03 PM -03, Masahiko Sawada wrote:
    > IIUC the performance regression occurs when users insert many rows
    > into a foreign table with batch_size = 1 and use_copy_for_insert =
    > true (tps: 133.451518 vs. 132.644801 vs. 88.998804). Since batch_size
    > defaults to 1, users might experience performance issues if they
    > enable use_copy_for_insert without adjusting the batch_size. I'm
    > worried that users could easily find themselves in this situation.
    >
    Yes, you are correct. The 0002 patch aims to reduce this issue by using
    the COPY command only if the use_copy_for_insert = true and if
    batch_size > 1 which will reduce the cases but the regression can still
    happen if the user send a single row to insert into a foreign table.
    
    Inserting a single row into a foreign table using COPY is a bit slower
    compared with using INSERT. See the followinw pgbench results:
    
    (Single row using INSERT)
        tps = 19814.535944
    
    (Single row using COPY)
        tps = 16562.324025
    
    I think that the documentation should mention that just changing
    use_copy_for_insert without also changing the batch_size option could
    cause performance regression.
    
    > One possible solution would be to introduce a threshold, like
    > copy_min_row, which would specify the minimum number of rows needed
    > before switching to the COPY command. However, this would require
    > coordination with batch_size since having copy_min_row lower than
    > batch_size wouldn't make sense.
    >
    The only problem that I see with this approach is that it would make
    EXPLAIN(VERBOSE) and EXPLAIN(ANALYZE, VERBOSE) remote SQL output
    different. The user will never know with EXPLAIN (without analyze) if
    the COPY will be used or not. Is this a problem or I'm being to much
    conservative?
    
    I think that we can do such coordination on postgres_fdw_validator().
    
    Also if we decide to go with this idea it seems to me that we would have
    to much table options to configure to enable the COPY opitimization, we
    would need "copy_min_row", "batch_size" and "use_copy_for_insert". What
    about decide to use the COPY command if use_copy_for_insert = true and
    the number of rows being inserted is >= batch_size? 
    
    > Alternatively, when users are using batch insertion (batch_size > 0),
    > we could use the COPY command only for full batches and fall back to
    > INSERT for partial ones.
    >
    IIUC in this case we would sent COPY and INSERT statements to the
    foreign server for the same execution, for example, if batch_size = 100
    and the user try insert 105 rows into the foreign table we will send a
    COPY statement with 100 rows and then an INSERT with the 5 rows
    remaining? If that's the case which SQL we should show on Remote SQL
    from EXPLAIN(ANALYZE, VERBOSE) output? I think that this can cause some
    confusion.
    
    > BTW I noticed that use_copy_for_insert option doesn't work with COPY
    > FROM command. I got the following error with use_copy_for_insert=true
    > and batch_size=3:
    >
    > postgres(1:2546195)=# copy t from '/tmp/a.csv'; -- table 't' is a foreign table.
    > ERROR:  there is no parameter $1
    > CONTEXT:  remote SQL command: INSERT INTO public.t(c) VALUES ($1)
    > COPY t
    >
    Thanks for testing this case. The problem was that I as checking if the
    COPY can be used inside create_foreign_modify() that is called by
    BeginForeignInsert and also BeginForeignModify() and the COPY can be
    used only by the foreign modify path. To fix this issue I've moved the
    check to postgresBeginForeignModify().
    
    I'm attaching v6 with the following changes:
    - I've squashed 0002 into 0001, so now the COPY will only be used if
      use_copy_for_insert = true and if batch_size > 1
    - Fix for the bug of COPY FROM a foreign table
    - New test case for the COPY bug
    
    --
    Matheus Alcantara
    EDB: http://www.enterprisedb.com