Re: [PATCH] Optionally record Plan IDs to track plan changes for a query

Андрей Казачков <andrey.kazachkov@tantorlabs.ru>

From: Андрей Казачков <andrey.kazachkov@tantorlabs.ru>
To: Sami Imseih <samimseih@gmail.com>, Lukas Fittl <lukas@fittl.com>, Michael Paquier <michael@paquier.xyz>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Marko M <marko@pganalyze.com>
Date: 2025-12-25T14:33:11Z
Lists: pgsql-hackers

Attachments

  
    Dear PostgreSQL Hackers,
  
  
     
  
  
    I’ve been testing the proposed v5 plan id work and found out instability of
  
  
    computing the plan identifier after feeding different query texts that
  
  
    produces the same physical plan trees but with different plan ids. The main
  
  
    pattern regards with fields of Plan node structures that depend on positions of
  
  
    RTEs in a RTE list.
  
  
     
  
  
    For your convenience, I’ve attached minimal reproducible examples that
  
  
    demonstrate this pattern. Both are based on the v6-0001 patch, which
  
  
    incorporates v5-0001..v5-0003 changes and fixes jumbling empty arrays.
  
  
     
  
  
    ```
  
  
    create table foo1 (num int);
  
  
    create table foo2 (num int);
  
  
     
  
  
    insert into foo1 (num)
  
  
    select *
  
  
    from generate_series(1, 1000);
  
  
     
  
  
    insert into foo2 (num)
  
  
    select *
  
  
    from generate_series(1, 10);
  
  
     
  
  
    analyze foo1;
  
  
    analyze foo2;
  
  
     
  
  
    set compute_plan_id to true;
  
  
    ```
  
  
     
  
  
    The first example changes join order between two tables:
  
  
    ```
  
  
    explain (costs off, verbose)
  
  
    select 1 from foo1 join foo2 on foo1.num = foo2.num;
  
  
     
  
  
                QUERY PLAN
  
  
    -------------------------------------
  
  
     Hash Join
  
  
       Output: 1
  
  
       Hash Cond: (foo1.num = foo2.num)
  
  
       ->  Seq Scan on public.foo1
  
  
             Output: foo1.num
  
  
       ->  Hash
  
  
             Output: foo2.num
  
  
             ->  Seq Scan on public.foo2
  
  
                   Output: foo2.num
  
  
     Plan Identifier: 538643160186222168
  
  
     
  
  
    explain (costs off, verbose)
  
  
    select 1 from foo2 join foo1 on foo1.num = foo2.num;
  
  
     
  
  
                 QUERY PLAN
  
  
    --------------------------------------
  
  
     Hash Join
  
  
       Output: 1
  
  
       Hash Cond: (foo1.num = foo2.num)
  
  
       ->  Seq Scan on public.foo1
  
  
             Output: foo1.num
  
  
       ->  Hash
  
  
             Output: foo2.num
  
  
             ->  Seq Scan on public.foo2
  
  
                   Output: foo2.num
  
  
     Plan Identifier: -953143034841089498
  
  
    ```
  
  
    Here the reordering of relations in the JOIN operator changes their order in
  
  
    the RTE list, which in turn changes RTE position-sensitive fields like `varno`
  
  
    in Vars, and those differences leak into jumbling.
  
  
     
  
  
    The second example assumes transformation of some table expressions to a
  
  
    new form like CTE-inlining or completely its eliminating. Let's see example:
  
  
    ```
  
  
    explain (costs off, verbose)
  
  
    WITH foo_cte as (
  
  
      SELECT num FROM foo1)
  
  
    select * from foo_cte;
  
  
     
  
  
                  QUERY PLAN
  
  
    --------------------------------------
  
  
     Seq Scan on public.foo1
  
  
       Output: foo1.num
  
  
     Plan Identifier: 3494394630757173099
  
  
     
  
  
    explain (costs off, verbose)
  
  
    select * from foo1;
  
  
     
  
  
                  QUERY PLAN
  
  
    --------------------------------------
  
  
     Seq Scan on public.foo1
  
  
       Output: num
  
  
     Plan Identifier: 8116143677260771228
  
  
    ```
  
  
    In the example with a CTE, RTE list contains the outdated subquery item and the
  
  
    relation one, whereas without a CTE we have just the relation item. Although
  
  
    the CTE is inlined, its RTE is not removed from the rtable; as a result,
  
  
    position-sensitive fields (such as `varno` for Vars) differ from the no-CTE
  
  
    case.
  
  
     
  
  
    A possible way to address it during jumbling process:
  
  
    1. Remove/skip unused RTE list elements.
  
  
    2. Sort active RTE list elements by some stable criteria.
  
  
    3. Adjust fields referring to active RTEs.
  
  
     
  
  
    But the main challenge is identifying all “position-sensitive” fields across
  
  
    node types efficiently and maintainably. I’d happy to see your feedback on
  
  
    this issue.
  
  
     
  
  
    Additionally, I noticed that the `location` field is being jumbled for several
  
  
    structures (for example, `Var`). Since it’s only a token location, I believe we
  
  
    should not include it in the final plan id value.
  
  
     
  
  
    --
  
  
    Sincerely,
  
  
    Andrey Kazachkov