Thread

Re: Proposal - Allow extensions to set a Plan Identifier

Lukas Fittl <lukas@fittl.com> — 2025-03-25T07:45:28Z
On Tue, Mar 25, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz>
wrote:

> On Mon, Mar 24, 2025 at 06:47:59PM -0500, Sami Imseih wrote:
> > I know it was mentioned above by both Michael and Andrei that
> > AppendJumble should not be exposed. I am not sure I agree with
> > that. I think it should be exposed, along with
> > JUMBLE_FIELD, JUMBLE_FIELD_SINGLE and JUMBLE_STRING
> > and _jumbleList.
> >
> > I am afraid that if we don't expose these routines/macros, the
> > extension will have
> > to re-invent those wheels. This is not included in v1-0001, but If
> > there is no objection,
> > I can update the patch. Thoughts?
>
> Hmm, yes, the macros could prove to be useful to allow extensions to
> compile their own code the manipulations they want to do on the node
> structures they'd like to touch, but wouldn't that mean that they
> would copy in some way a portion of what gen_node_support.pl does?
>

I agree with Sami that we should expose AppendJumble (it'd be necessary for
any extension that wants to re-use the jumbling logic), and I don't see a
reason to skip over the convenience macros, but also not hard to recreate
those. AFAIK you could get by without having access to _jumbleList, since
you can just call JumbleNode which can jumble lists.

The initial patch I had posted over at the other thread [0], (patch 0003 to
be precise in the initial set, which was partially based on work Sami had
done previously), showed what I think is the minimum we should enable:

case T_IndexScan:
{
  IndexScan  *splan = (IndexScan *) plan;
  ...
  JUMBLE_VALUE(splan->indexid);
  JumbleNode(jstate, (Node *) splan->indexqual);
  ...

i.e. allow someone to defer to core logic for expressions, but require them
to implement their own (manual) way of writing the jumble
functions/conditions for the more limited set of expression-containing
nodes they care about (like plan nodes).

Of course, thinking about other use cases, I could imagine someone would
potentially be interested to change core jumbling logic for only a specific
node (e.g. implementing a query ID that considers constant values to be
significant), but I'd argue that making that level of customization work is
out of scope / hard to do in general.

Perhaps we should think more carefully about this part, and refactor
> this script to work as a perl module so as it could be reused by some
> external code, at least for the parsing of the headers and the
> attributes assigned to each nodes and their attributes?
>

I've been thinking about how to rework things for a PG18-requiring
pg_stat_plans extension I'd like to publish, and if I were to choose a node
attribute-based approach I'd probably:

1. Check out the upstream Postgres source for the given version I'm
generating jumble code for
2. Modify the headers as needed to have the node attributes I want
3. Call the gen_node_support.pl via the build process
4. Copy out the resulting jumble node funcs/conds

Even with the state currently committed this is already possible, but (4)
ends up with a lot of duplicated code in the extension. Assuming we have a
way to call jumbleNode + AppendJumble and macros, that step could only keep
the jumble conds/funcs that are not defined in core.

So based on that workflow I'm not sure turning gen_node_support.pl into a
re-usable module would help, but that's just my perspective without having
built this out (yet).

Thanks,
Lukas

[0]:
https://www.postgresql.org/message-id/flat/CAP53Pkyow59ajFMHGpmb1BK9WHDypaWtUsS_5DoYUEfsa_Hktg%40mail.gmail.com

-- 
Lukas Fittl