Thread
-
Re: pg_plan_advice
Jakub Wartak <jakub.wartak@enterprisedb.com> — 2025-12-18T12:27:27Z
On Wed, Dec 17, 2025 at 2:44 PM Jakub Wartak <jakub.wartak@enterprisedb.com> wrote: > > On Wed, Dec 17, 2025 at 11:12 AM Jakub Wartak > <jakub.wartak@enterprisedb.com> wrote: > > > > On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > > > Here's v7. > > [..] > >[..q20..] > > OK, now for the q10: Hi, this is a follow-up just to the q10. > So to me it looks like in Generated Plan Advice we: > - have proper HASH_JOIN(customer nation) > - but it somehow forgot to include "HASH_JOIN(orders)" to cover for > that Parallel Hash Join on (orders.o_custkey = customer.c_custkey) > with input from NL. After adding that manually, it achieves the same > input plan properly. [..] Well, it's quite a ride with the Q10 and I partially wrong with above: 0. The reported earlier wrong missing "HASH_JOIN(orders customer)" - that part was okay 1. The Incremental Sort is being used in the original plan, but is still IS not reflected in the generated advice. 2a. I've noticed Memoize/Index Scan was not being respected for "nation" 2b. Seq scan for nation was being done for "nation" So total modification list, I've ended up doing (+ for adding , - for removing): + HASH_JOIN(orders customer) -- from earlier reply + NESTED_LOOP_MEMOIZE(nation) + INDEX_SCAN(nation public.pk_nation) - HASH_JOIN(customer nation) -- as it was we were having NL() in org plan SEQ_SCAN(orders customer nation) ==> SEQ_SCAN(orders customer) In full the best shape seems to be Q10 with pg_plan_advice.advice = 'HASH_JOIN(orders customer) JOIN_ORDER(orders lineitem customer nation) NESTED_LOOP_PLAIN(lineitem) SEQ_SCAN(orders customer) INDEX_SCAN(lineitem public.lineitem_l_orderkey_idx_l_returnflag) GATHER_MERGE((customer orders lineitem nation)) NESTED_LOOP_MEMOIZE(nation)'; which yields: Sort Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric - lineitem.l_discount)))) DESC -> GroupAggregate Group Key: customer.c_custkey, nation.n_name -> Gather Merge Workers Planned: 2 -> Sort Sort Key: customer.c_custkey, nation.n_name -> Nested Loop -> Parallel Hash Join Hash Cond: (orders.o_custkey = customer.c_custkey) -> Nested Loop -> Parallel Seq Scan on orders Filter: ((o_orderdate >= '1993-10-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone)) -> Index Scan using lineitem_l_orderkey_idx_l_returnflag on lineitem Index Cond: (l_orderkey = orders.o_orderkey) -> Parallel Hash -> Parallel Seq Scan on customer -> Memoize Cache Key: customer.c_nationkey Cache Mode: logical -> Index Scan using pk_nation on nation Index Cond: (n_nationkey = customer.c_nationkey) but that Incremental Sort *is* still missing. In original plan we are doing Incremental Sort (Sort Key: customer.c_custkey, nation.n_name, Presorted Key: customer.c_custkey) <-- .... Sort(Sort Key: customer.c_custkey) However, even with my overrides I haven't found an immediately obvious way to force it to use Incremental Sort on a specific field, so it just sorts on two at once. Maybe it's something that should be expressed through GATHER_MERGE()?, but that's not obvious how and where. In terms of raw performance , it seems to be very similiar (98ms +/- 8ms even between those two). -J.