Thread

Re: pg_plan_advice

Jakub Wartak <jakub.wartak@enterprisedb.com> — 2025-10-31T09:58:59Z
On Thu, Oct 30, 2025 at 3:00 PM Robert Haas <robertmhaas@gmail.com> wrote:

[..over 400kB of attachments, yay]

Thank You for working on this!

My gcc-13 was nitpicking a little bit (see
compilation_warnings_v1.txt), so attached is just a tiny diff to fix
some of those issues. After that, clang-20 run was clean too.

> First, any form of user control over the planner tends to be a lightning rod for criticism around here.

I do not know where this is coming from, but everybody I've talked to
was saying this is needed to handle real enterprise databases and
applications. I just really love it, how one could precisely adjust
the plan with this even with the presence of heavy aliasing:

postgres=# explain (plan_advice, costs off) SELECT * FROM (select *
from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and
b.id = c.id;
                     QUERY PLAN
-----------------------------------------------------
 Merge Join
   Merge Cond: (a.id = c.id)
   ->  Merge Join
         Merge Cond: (a.id = b.id)
         ->  Index Scan using t1_pkey on t1 a
         ->  Index Scan using t2_pkey on t2 b
   ->  Sort
         Sort Key: c.id
         ->  Seq Scan on t3 c
 Supplied Plan Advice:
   SEQ_SCAN(ble5) /* not matched */
 Generated Plan Advice:
   JOIN_ORDER(a#2 b#2 c)
   MERGE_JOIN_PLAIN(b#2 c)
   SEQ_SCAN(c)
   INDEX_SCAN(a#2 public.t1_pkey )
   NO_GATHER(c a#2 b#2)
(17 rows)

postgres=# set pg_plan_advice.advice = 'SEQ_SCAN(b#2)';
SET
postgres=# explain (plan_advice, costs off) SELECT * FROM (select *
from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and
b.id = c.id;
                     QUERY PLAN
----------------------------------------------------
 Hash Join
   Hash Cond: (b.id = a.id)
   ->  Seq Scan on t2 b
   ->  Hash
         ->  Merge Join
               Merge Cond: (a.id = c.id)
               ->  Index Scan using t1_pkey on t1 a
               ->  Sort
                     Sort Key: c.id
                     ->  Seq Scan on t3 c
 Supplied Plan Advice:
   SEQ_SCAN(b#2) /* matched */
 Generated Plan Advice:
   JOIN_ORDER(b#2 (a#2 c))
   MERGE_JOIN_PLAIN(c)
   HASH_JOIN(c)
   SEQ_SCAN(b#2 c)
   INDEX_SCAN(a#2 public.t1_pkey)
   NO_GATHER(c a#2 b#2)

To attract a little attention to the $thread, the only bigger design
(usability) question that keeps ringing in my head is how we are going
to bind it to specific queries without even issuing any SETs(or ALTER
USER) in the far future in the grand scheme of things. The discussed
query id (hash), full query text comparison, maybe even strstr(query ,
"partial hit") or regex all seem to be kind too limited in terms of
what crazy ORMs can come up with (each query will be potentially
slightly different, but if optimizer reference points are stable that
should nail it good enough, but just enabling it for the very specific
set of queries and not the others [with same aliases] is some major
challenge).

Due to this, at some point I was even thinking about some hashes for
every plan node (including hashes of subplans), e.g.:
 Merge Join // hash(MERGE_JOIN_PLAIN(b#2) + ';' somehashval1 + ';'+
somehahsval2 ) => somehashval3
   Merge Cond: (a.id = c.id)
   ->  Merge Join
         Merge Cond: (a.id = b.id)
         ->  Index Scan using t1_pkey on t1 a // hash(INDEX_SCAN(a#2
public.t1_pkey)) => somehashval1
         ->  Index Scan using t2_pkey on t2 b // hash(INDEX_SCAN(b#2
public.t2_pkey)) => somehashval2

and then having a way to use `somehashval3` (let's say it's SHA1) as a
way to activate the necessary advice. Something like having a way to
express it using plan_advice.on_subplanhashes_plan_advice =
'somehashval3: SEQ_SCAN(b#2)'. This would have the benefit of being
able to override multiple similiar SQL queries in one go rather than
collecting all possible query_ids, but probably it's stupid,
heavyweight, but that would be my dream ;)

-J.