Thread

  1. Inserting heap tuples in bulk in COPY

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-08-12T19:16:50Z

    COPY is slow. Let's make it faster. One obvious optimization is to 
    insert heap tuples in bigger chunks, instead of calling heap_insert() 
    separately for every tuple. That saves the overhead of pinning and 
    locking the buffer for every tuple, and you only need to write one WAL 
    record for all the tuples written to the same page, instead of one for 
    each tuple.
    
    Attached is a WIP patch to do that. It adds a new function, 
    heap_multi_insert, which does the same thing as heap_insert, but works 
    in bulk. It takes an array of tuples as argument, and tries to cram as 
    many of them into the chosen targe page as it can, and only writes a 
    single WAL record of the operation.
    
    This gives a significant speedup to COPY, particularly for narrow 
    tables, with small tuples. Grouping multiple tuples into one WAL record 
    reduces the WAL volume significantly, and the time spent in writing that 
    WAL. The reduced overhead of repeatedly locking the buffer is also most 
    noticeable on narrow tables. On wider tables, the effects are smaller. 
    See copytest-results.txt, containing test results with three tables of 
    different widths. The scripts used to get those numbers are also attached.
    
    Triggers complicate this. I believe it is only safe to group tuples 
    together like this if the table has no triggers. A BEFORE ROW trigger 
    might run a SELECT on the table being copied to, and check if some of 
    the tuples we're about to insert exist. If we run BEFORE ROW triggers 
    for a bunch of tuples first, and only then insert them, none of the 
    trigger invocations will see the other rows as inserted yet. Similarly, 
    if we run AFTER ROW triggers after inserting a bunch of tuples, the 
    trigger for each of the insertions would see all the inserted rows. So 
    at least for now, the patch simply falls back to inserting one row at a 
    time if there are any triggers on the table.
    
    The patch is WIP, mainly because I didn't write the WAL replay routines 
    yet, but please let me know if you see any issues.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com