Re: Assorted improvements in pg_dump

From: Justin Pryzby <pryzby@telsasoft.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Hans Buschmann <buschmann@nidsa.net>, pgsql-hackers@postgresql.org, Andres Freund <andres@anarazel.de>
Date: 2021-10-24T22:03:37Z
Lists: pgsql-hackers

Commits

  1. pg_dump: avoid unsafe function calls in getPolicies().

  2. Postpone calls of unsafe server-side functions in pg_dump.

  3. Account for TOAST data while scheduling parallel dumps.

  4. Use PREPARE/EXECUTE for repetitive per-object queries in pg_dump.

  5. Avoid per-object queries in performance-critical paths in pg_dump.

  6. Rethink pg_dump's handling of object ACLs.

  7. Refactor pg_dump's tracking of object components to be dumped.

  8. pg_dump: fix mis-dumping of non-global default privileges.

On Sun, Oct 24, 2021 at 05:10:55PM -0400, Tom Lane wrote:
> 0003 is the same except I added a missing free().
> 
> 0004 is a new patch based on an idea from Andres Freund [1]:
> in the functions that repetitively issue the same query against
> different tables, issue just one query and use a WHERE clause
> to restrict the output to the tables we care about.  I was
> skeptical about this to start with, but it turns out to be
> quite a spectacular win.  On my machine, the time to pg_dump
> the regression database (with "-s") drops from 0.91 seconds
> to 0.39 seconds.  For a database with 10000 toy tables, the
> time drops from 18.1 seconds to 2.3 seconds.

+               if (tbloids->len > 1)
+                       appendPQExpBufferChar(tbloids, ',');
+               appendPQExpBuffer(tbloids, "%u", tbinfo->dobj.catId.oid);

I think this should say 

+               if (tbloids->len > 0)

That doesn't matter much, since the separator is only missed when the first OID
in the list is a single digit: catalogs aren't dumped as such, and we tend to
count in base 10 and not base 10000.
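
To make the difference concrete, here is a minimal standalone sketch. It does
not use PQExpBuffer; OidList, oidlist_append and oidlist_build are hypothetical
stand-ins invented for illustration, and the sketch assumes the buffer is empty
before the first OID is appended. If the real buffer is seeded with a leading
character before the loop, the "> 1" test would be the right one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a PQExpBuffer: fixed size, NUL-terminated. */
typedef struct
{
    char   data[256];
    size_t len;
} OidList;

/*
 * Append one OID.  A comma is emitted first whenever the current length
 * exceeds "threshold" (0 for the proposed test, 1 for the patch as posted).
 */
static void
oidlist_append(OidList *list, unsigned int oid, size_t threshold)
{
    size_t room;
    int    written;

    if (list->len > threshold && list->len + 1 < sizeof(list->data))
    {
        list->data[list->len++] = ',';
        list->data[list->len] = '\0';
    }

    room = sizeof(list->data) - list->len;
    written = snprintf(list->data + list->len, room, "%u", oid);
    if (written > 0)
        list->len += ((size_t) written < room) ? (size_t) written : room - 1;
}

/* Reset the list to empty, then append every OID using the given threshold. */
static const char *
oidlist_build(OidList *list, const unsigned int *oids, size_t n,
              size_t threshold)
{
    size_t i;

    list->data[0] = '\0';
    list->len = 0;
    for (i = 0; i < n; i++)
        oidlist_append(list, oids[i], threshold);
    return list->data;
}
```

With the OIDs 5, 16384, 16390, a threshold of 0 yields "5,16384,16390", while a
threshold of 1 yields "516384,16390". With 16384, 16390 both thresholds give
"16384,16390", which is why the difference only shows up for a single-digit
first OID.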

BTW, the ACL patch makes the overhead about 6x lower (6.9 sec vs 1.2 sec) for
pg_dump -t of a single, small table.  Thanks for that.

-- 
Justin