Thread
-
Re: pg_plan_advice
Jakub Wartak <jakub.wartak@enterprisedb.com> — 2025-10-31T09:58:59Z
On Thu, Oct 30, 2025 at 3:00 PM Robert Haas <robertmhaas@gmail.com> wrote: [..over 400kB of attachments, yay] Thank You for working on this! My gcc-13 was nitpicking a little bit (see compilation_warnings_v1.txt), so attached is just a tiny diff to fix some of those issues. After that, clang-20 run was clean too. > First, any form of user control over the planner tends to be a lightning rod for criticism around here. I do not know where this is coming from, but everybody I've talked to was saying this is needed to handle real enterprise databases and applications. I just really love it, how one could precisely adjust the plan with this even with the presence of heavy aliasing: postgres=# explain (plan_advice, costs off) SELECT * FROM (select * from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and b.id = c.id; QUERY PLAN ----------------------------------------------------- Merge Join Merge Cond: (a.id = c.id) -> Merge Join Merge Cond: (a.id = b.id) -> Index Scan using t1_pkey on t1 a -> Index Scan using t2_pkey on t2 b -> Sort Sort Key: c.id -> Seq Scan on t3 c Supplied Plan Advice: SEQ_SCAN(ble5) /* not matched */ Generated Plan Advice: JOIN_ORDER(a#2 b#2 c) MERGE_JOIN_PLAIN(b#2 c) SEQ_SCAN(c) INDEX_SCAN(a#2 public.t1_pkey ) NO_GATHER(c a#2 b#2) (17 rows) postgres=# set pg_plan_advice.advice = 'SEQ_SCAN(b#2)'; SET postgres=# explain (plan_advice, costs off) SELECT * FROM (select * from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and b.id = c.id; QUERY PLAN ---------------------------------------------------- Hash Join Hash Cond: (b.id = a.id) -> Seq Scan on t2 b -> Hash -> Merge Join Merge Cond: (a.id = c.id) -> Index Scan using t1_pkey on t1 a -> Sort Sort Key: c.id -> Seq Scan on t3 c Supplied Plan Advice: SEQ_SCAN(b#2) /* matched */ Generated Plan Advice: JOIN_ORDER(b#2 (a#2 c)) MERGE_JOIN_PLAIN(c) HASH_JOIN(c) SEQ_SCAN(b#2 c) INDEX_SCAN(a#2 public.t1_pkey) NO_GATHER(c a#2 b#2) To attract a little attention to the $thread, the only bigger design (usability) question that keeps ringing in my head is how we are going to bind it to specific queries without even issuing any SETs(or ALTER USER) in the far future in the grand scheme of things. The discussed query id (hash), full query text comparison, maybe even strstr(query , "partial hit") or regex all seem to be kind too limited in terms of what crazy ORMs can come up with (each query will be potentially slightly different, but if optimizer reference points are stable that should nail it good enough, but just enabling it for the very specific set of queries and not the others [with same aliases] is some major challenge). Due to this, at some point I was even thinking about some hashes for every plan node (including hashes of subplans), e.g.: Merge Join // hash(MERGE_JOIN_PLAIN(b#2) + ';' somehashval1 + ';'+ somehahsval2 ) => somehashval3 Merge Cond: (a.id = c.id) -> Merge Join Merge Cond: (a.id = b.id) -> Index Scan using t1_pkey on t1 a // hash(INDEX_SCAN(a#2 public.t1_pkey)) => somehashval1 -> Index Scan using t2_pkey on t2 b // hash(INDEX_SCAN(b#2 public.t2_pkey)) => somehashval2 and then having a way to use `somehashval3` (let's say it's SHA1) as a way to activate the necessary advice. Something like having a way to express it using plan_advice.on_subplanhashes_plan_advice = 'somehashval3: SEQ_SCAN(b#2)'. This would have the benefit of being able to override multiple similiar SQL queries in one go rather than collecting all possible query_ids, but probably it's stupid, heavyweight, but that would be my dream ;) -J.