Thread

  1. Re: Patch: dumping tables data in multiple chunks in pg_dump

    Hannu Krosing <hannuk@google.com> — 2025-11-13T20:26:46Z

    The reason for small chunk sizes is that they are determined by main
    heap table, and that was just over 1GB
    
    largetoastdb=> SELECT format('%I.%I', t.schemaname, t.relname) as table_name,
           pg_table_size(t.relid) AS table_size,
           sum(pg_relation_size(i.indexrelid)) AS total_index_size,
           pg_relation_size(t.relid) AS main_table_size,
           pg_relation_size(c.reltoastrelid) AS toast_table_size,
           pg_relation_size(oi.indexrelid) AS toast_index_size,
           t.n_live_tup AS row_count,
           count(*) AS index_count,
           array_to_json(array_agg(json_build_object(i.indexrelid::regclass,
    pg_relation_size(i.indexrelid))), true) AS index_info
      FROM pg_stat_user_tables t
      JOIN pg_stat_user_indexes i ON i.relid = t.relid
      JOIN pg_class c ON c.oid = t.relid
      LEFT JOIN pg_stat_sys_indexes AS oi ON oi.relid = c.reltoastrelid
     GROUP BY 1, 2, 4, 5, 6, 7
     ORDER BY 2 DESC, 7 DESC
     LIMIT 25;
    ┌─[ RECORD 1 ]─────┬─────────────────────────────────────┐
    │ table_name       │ public.just_toasted                 │
    │ table_size       │ 56718835712                         │
    │ total_index_size │ 230064128                           │
    │ main_table_size  │ 1191559168                          │
    │ toast_table_size │ 54613336064                         │
    │ toast_index_size │ 898465792                           │
    │ row_count        │ 5625234                             │
    │ index_count      │ 1                                   │
    │ index_info       │ [{"just_toasted_pkey" : 230064128}] │
    └──────────────────┴─────────────────────────────────────┘
    
    On Thu, Nov 13, 2025 at 9:24 PM Hannu Krosing <hannuk@google.com> wrote:
    >
    > Ran another test with a 53GB database where most of the data is in TOAST
    >
    > CREATE TABLE just_toasted(
    >   id serial primary key,
    >   toasted1 char(2200) STORAGE EXTERNAL,
    >   toasted2 char(2200) STORAGE EXTERNAL,
    >   toasted3 char(2200) STORAGE EXTERNAL,
    >   toasted4 char(2200) STORAGE EXTERNAL
    > );
    >
    > and the toast fields were added in somewhat randomised order.
    >
    > Here the results are as follows
    >
    > Parallelism   |   chunk size (pages)   |    time (sec)
    >  1        |    -         |     240
    >  2        |  1000    |     129
    >  4        |  1000    |      64
    >  8        |  1000    |      36
    > 16       |  1000    |      30
    >
    >  4        |  9095    |      78
    >  8        |  9095    |      42
    > 16       |  9095    |      42
    >
    > The reason larger chunk sizes performed worse was that they often had
    > one or two stragglers left behind which
    >
    > Detailed run results below:
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres -f
    > /tmp/ltoastdb-1-plain.dump largetoastdb
    > real    3m59.465s
    > user    3m43.304s
    > sys     0m15.844s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=9095 -j 4 -f /tmp/ltoastdb-4.dump
    > largetoastdb
    > real    1m18.320s
    > user    3m49.236s
    > sys     0m19.422s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=9095 -j 8 -f /tmp/ltoastdb-8.dump
    > largetoastdb
    > real    0m42.028s
    > user    3m55.299s
    > sys     0m24.657s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=9095 -j 16 -f /tmp/ltoastdb-16.dump
    > largetoastdb
    > real    0m42.575s
    > user    4m11.011s
    > sys     0m26.110s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=1000 -j 16 -f /tmp/ltoastdb-16-1kpages.dump
    > largetoastdb
    > real    0m29.641s
    > user    6m16.321s
    > sys     0m49.345s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=1000 -j 8 -f /tmp/ltoastdb-8-1kpages.dump
    > largetoastdb
    > real    0m35.685s
    > user    3m58.528s
    > sys     0m26.729s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=1000 -j 4 -f /tmp/ltoastdb-4-1kpages.dump
    > largetoastdb
    > real    1m3.737s
    > user    3m50.251s
    > sys     0m18.507s
    >
    > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > --format=directory -h 10.58.80.2 -U postgres
    > --huge-table-chunk-pages=1000 -j 2 -f /tmp/ltoastdb-2-1kpages.dump
    > largetoastdb
    > real    2m8.708s
    > user    3m57.018s
    > sys     0m18.499s
    >
    > On Thu, Nov 13, 2025 at 7:39 PM Hannu Krosing <hannuk@google.com> wrote:
    > >
    > > Going up to 16 workers did not improve performance , but this is
    > > expected, as the disk behind the database can only do 4TB/hour of
    > > reads, which is now the bottleneck. (408/352/*3600 = 4172 GB/h)
    > >
    > > $ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
    > > --huge-table-chunk-pages=131072 -j 16 -f /tmp/parallel16.dump largedb
    > > real    5m44.900s
    > > user    53m50.491s
    > > sys     5m47.602s
    > >
    > > And 4 workers showed near-linear speedup from single worker
    > >
    > > hannuk@pgn2:~/work/postgres/src/bin/pg_dump$ time ./pg_dump
    > > --format=directory -h 10.58.80.2 -U postgres
    > > --huge-table-chunk-pages=131072 -j 4 -f /tmp/parallel4.dump largedb
    > > real    10m32.074s
    > > user    38m54.436s
    > > sys     2m58.216s
    > >
    > > The database runs on a 64vCPU VM with 128GB RAM, so most of the table
    > > will be read in from the disk
    > >
    > >
    > >
    > >
    > >
    > >
    > > On Thu, Nov 13, 2025 at 7:02 PM Hannu Krosing <hannuk@google.com> wrote:
    > > >
    > > > I just ran a test by generating a 408GB table and then dumping it both ways
    > > >
    > > > $ time pg_dump --format=directory -h 10.58.80.2 -U postgres -f
    > > > /tmp/plain.dump largedb
    > > >
    > > > real    39m54.968s
    > > > user    37m21.557s
    > > > sys     2m32.422s
    > > >
    > > > $ time ./pg_dump --format=directory -h 10.58.80.2 -U postgres
    > > > --huge-table-chunk-pages=131072 -j 8 -f /tmp/parallel8.dump largedb
    > > >
    > > > real    5m52.965s
    > > > user    40m27.284s
    > > > sys     3m53.339s
    > > >
    > > > So parallel dump with 8 workers using 1GB (128k pages) chunks runs
    > > > almost 7 times faster than the sequential dump.
    > > >
    > > > this was a table that had no TOAST part. I will run some more tests
    > > > with TOASTed tables next and expect similar or better improvements.
    > > >
    > > >
    > > >
    > > > On Wed, Nov 12, 2025 at 1:59 PM Ashutosh Bapat
    > > > <ashutosh.bapat.oss@gmail.com> wrote:
    > > > >
    > > > > Hi Hannu,
    > > > >
    > > > > On Tue, Nov 11, 2025 at 9:00 PM Hannu Krosing <hannuk@google.com> wrote:
    > > > > >
    > > > > > Attached is a patch that adds the ability to dump table data in multiple chunks.
    > > > > >
    > > > > > Looking for feedback at this point:
    > > > > >  1) what have I missed
    > > > > >  2) should I implement something to avoid single-page chunks
    > > > > >
    > > > > > The flag --huge-table-chunk-pages which tells the directory format
    > > > > > dump to dump tables where the main fork has more pages than this in
    > > > > > multiple chunks of given number of pages,
    > > > > >
    > > > > > The main use case is speeding up parallel dumps in case of one or a
    > > > > > small number of HUGE tables so parts of these can be dumped in
    > > > > > parallel.
    > > > >
    > > > > Have you measured speed up? Can you please share the numbers?
    > > > >
    > > > > --
    > > > > Best Wishes,
    > > > > Ashutosh Bapat