Re: proposal: schema variables

Pavel Stehule <pavel.stehule@gmail.com>

From: Pavel Stehule <pavel.stehule@gmail.com>
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Erik Rijkers <er@xs4all.nl>, Michael Paquier <michael@paquier.xyz>, Amit Kapila <amit.kapila16@gmail.com>, DUVAL REMI <REMI.DUVAL@cheops.fr>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2024-08-29T17:33:46Z
Lists: pgsql-hackers, pgsql-performance

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Move WAL sequence code into its own file

  2. Add ExplainState argument to pg_plan_query() and planner().

  3. Don't include access/htup_details.h in executor/tuptable.h

  4. Refactor to avoid code duplication in transformPLAssignStmt.

  5. Avoid including commands/dbcommands.h in so many places

  6. Restrict psql meta-commands in plain-text dumps.

  7. Split func.sgml into more manageable pieces

  8. Fix squashing algorithm for query texts

  9. EXPLAIN: Always use two fractional digits for row counts.

  10. Preliminary refactoring of plpgsql expression construction.

  11. plpgsql: pure parser and reentrant scanner

  12. Add some sanity checks in executor for query ID reporting

  13. Fix misleading error message context

  14. Add macros for looping through a List without a ListCell.

Attachments

út 27. 8. 2024 v 8:52 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

> Hi
>
> út 27. 8. 2024 v 8:15 odesílatel Laurenz Albe <laurenz.albe@cybertec.at>
> napsal:
>
>> On Thu, 2024-08-15 at 07:55 +0200, Pavel Stehule wrote:
>> > út 30. 7. 2024 v 21:46 odesílatel Laurenz Albe <
>> laurenz.albe@cybertec.at> napsal:
>> > > - A general reminder: single line comments should start with a lower
>> case
>> > >   letter and have to period in the end:
>> >
>> > Should it be "have not to period in the end" ?
>>
>> I made a typo.  I mean "have *no* period in the end".
>>
>> > I fixed almost all without parts related to psql and executor - there
>> almost all
>> > current comments break the mentioned rule. So I use the style used in
>> these files.
>> > I can fix it (if you like it) - or it can be fixed generally in a
>> separated patch set.
>>
>> Thanks.  I also noticed that a lot of existing comments break that rule,
>> so I think that it is fine if you stick with the way that the surrounding
>> code does it.
>>
>> > > - Variable as parameter:
>> > >
>> > >   CREATE VARIABLE var AS date;
>> > >   LET var = current_date;
>> > >   PREPARE stmt(date) AS SELECT $1;
>> > >   EXECUTE stmt(var);
>> > >   ERROR:  paramid of PARAM_VARIABLE param is out of range
>> > >
>> > >   Is that working as intended?  I don't understand the error message.
>> >
>> > fixed in 0002 patch (variables cannot be used as EXECUTE argument) and
>> 0014 (enable usage variables as an argument of EXECUTE)
>>
>> Thanks.
>>
>> > >   > --- a/src/backend/catalog/namespace.c
>> > >   > +++ b/src/backend/catalog/namespace.c
>> > >   > [...]
>> > >   > +/*
>> > >   > + * IdentifyVariable - try to find variable identified by list of
>> names.
>> > >   > [...]
>> > >
>> > >   Perhaps part of the reason why this function is so complicated is
>> that
>> > >   you support things like this:
>> > >
>> > >     CREATE SCHEMA sch;
>> > >     CREATE TYPE cp AS (a integer, b integer);
>> > >     CREATE VARIABLE sch.v AS cp;
>> > >     LET sch.v = (1, 2);
>> > >     SELECT sch.v.b;
>> > >      b
>> > >     ═══
>> > >      2
>> > >     (1 row)
>> > >
>> > >     test=# SELECT test.sch.v.b;
>> > >      b
>> > >     ═══
>> > >      2
>> > >     (1 row)
>> > >
>> > >   We don't support that for tables:
>> > >
>> > >     CREATE TABLE tab (c cp);
>> > >     INSERT INTO tab VALUES (ROW(42, 43));
>> > >     SELECT tab.c.b FROM tab;
>> > >     ERROR:  cross-database references are not implemented: tab.c.b
>> > >
>> > >   You have to use extra parentheses:
>> > >
>> > >     SELECT (tab.c).b FROM tab;
>> > >      b
>> > >     ════
>> > >      43
>> > >     (1 row)
>> > >
>> > >   I'd say that this should be the same for session variables.
>> > >   What do you think?
>> >
>> > I prefer the current state, but I don't have a very strong opinion
>> about it.
>> > It can save 115 lines of almost trivial code, but I think the user
>> experience
>> > can be much worse.  Using composite types in tables is not too common a
>> > pattern (and when I read some recommendations for Oracle, it is an
>> antipattern),
>> > but usage of composite variables is common (it can hold a row).
>> Moreover,
>> > we talked long about possible identifier's collisions, and the pattern
>> > schema.variable is very good protection against possible collisions.
>> > Probably the pattern catalog.schema.variable.field is not too
>> interesting
>> > but the support has 50 lines.
>> >
>> > More, the plpgsql supports pattern label.variable.field, so can be
>> little bit
>> > unfriendly if session variables doesn't support "similar" pattern
>>
>> I see your point, and I'm fine with leaving it as it is.
>>
>> >
>> >
>> > > - src/backend/commands/session_variable.c
>> > >
>> > >   There is one comment that confuses me and that I did not edit:
>> > >
>> > >   > +Datum
>> > >   > +GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
>> > >   > +{
>> > >   > +   SVariable   svar;
>> > >   > +
>> > >   > +   svar = get_session_variable(varid);
>> > >   > +
>> > >   > +   /*
>> > >   > +    * Although svar is freshly validated in this point, the
>> svar->is_valid can
>> > >   > +    * be false, due possible accepting invalidation message
>> inside domain
>> > >   > +    * check. Now, the validation is done after lock, that can
>> also accept
>> > >   > +    * invalidation message, so validation should be trustful.
>> > >   > +    *
>> > >   > +    * For now, we don't need to repeat validation. Only svar
>> should be valid
>> > >   > +    * pointer.
>> > >   > +    */
>> >
>> > This comment is related to assertions. Before I had there
>> `Assert(svar->is_valid)`,
>> > because I expected it. But it was not always true. And although it is
>> true,
>> > we don't need to validate a variable, because at this moment, the
>> variable
>> > should be locked, and then we can return content safely.
>>
>> I guess my main problem is the word "trustful".  I don't recognize that
>> word.
>> Perhaps you can reword the comment along the lines of your above
>> explanation.
>>
>
> I'll try to change it
>

is this better

<-->/*
<--> * Although svar is freshly validated in this point, the svar->is_valid
can
<--> * be false, due possible accepting invalidation message inside domain
<--> * check. But now, the variable, and all dependencies are locked, so we
<--> * don't need to repeat validation.
<--> */


>
>
>>
>> >
>> > > - src/backend/executor/execMain.c
>> > >
>> > >   > @@ -196,6 +198,51 @@ standard_ExecutorStart(QueryDesc *queryDesc,
>> int eflags)
>> > >   >     Assert(queryDesc->sourceText != NULL);
>> > >   >     estate->es_sourceText = queryDesc->sourceText;
>> > >   >
>> > >   > +   /*
>> > >   > +    * The executor doesn't work with session variables directly.
>> Values of
>> > >   > +    * related session variables are copied to dedicated array,
>> and this array
>> > >   > +    * is passed to executor.
>> > >   > +    */
>> > >
>> > >   Why?  Is that a performance measure?  The comment should explain
>> that.
>> >
>> > Session variables are volatile internally - some function that uses LET
>> statements
>> > can be called more times inside a query. This behavior is not
>> consistent with
>> > plpgsql's variables or external parameters. And if we want to support
>> parallel
>> > queries, then volatile session variables can be pretty messy (and
>> unusable).
>> > If we want the same behaviour for queries with or without parallel
>> support,
>> > then the session variables should be "stable" (like stable functions).
>> > Simple implementation is using "snapshot" of values of used session
>> variables
>> > when query is started. This "snapshot" is immutable, so we don't need
>> to make
>> > a copy more times.
>> >
>> > I changed this comment
>>
>> Thanks.
>>
>> > > - parallel safety
>> > >
>> > >   > --- a/src/backend/optimizer/util/clauses.c
>> > >   > +++ b/src/backend/optimizer/util/clauses.c
>> > >   > @@ -935,6 +936,13 @@ max_parallel_hazard_walker(Node *node,
>> max_parallel_hazard_context *context)
>> > >   >         if (param->paramkind == PARAM_EXTERN)
>> > >   >             return false;
>> > >   >
>> > >   > +       /* We doesn't support passing session variables to
>> workers */
>> > >   > +       if (param->paramkind == PARAM_VARIABLE)
>> > >   > +       {
>> > >   > +           if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED,
>> context))
>> > >   > +               return true;
>> > >   > +       }
>> > >
>> > >   Even if a later patch alleviates that restriction, this patch
>> should document that
>> > >   session variables imply "parallel restricted".  I have added that
>> to doc/src/sgml/parallel.sgml.
>> > >
>> >
>> > merged (and removed in 0015)
>>
>> Thanks.
>>
>>
>> I hope I can do some more review at some point in the future...
>>
>> I sincerely hope that this patch set gets merged at some point.
>> The big obstacle is that it is so large.  That's probably because it is
>> pretty
>> feature-complete (but, as we have found, not totally free of bugs).
>>
>
> Any feature that builds its own system catalog cannot be short. The first
> two patches have 300KB, like all others and just basic support for reading
> and writing. All other features have 300KB. On second hand almost all code
> is simple, and there are no changes in critical parts of Postgres. The size
> and complexity of JSONPath is level higher. I can throw 200KB from another
> 300KB patch set which can be better for review, but it can be harder to
> maintain this patch set. I'll try it, and I'll send a reduced version.
>
>
>> Judging from the amount of time I put into my review so far, I guess that
>> this
>> patch set would keep a committer busy for several weeks.  Perhaps the
>> only way to
>> get this done would be to make you a committer...
>>
>
> :-)
>
> Unfortunately, I am very bad at languages, I had a lot of problems in
> basic school just with Czech lang, Russian and English was more terrible) -
> so I very appreciate your work on this patch.
>
> Thank you very much
>
> Pavel
>


Regards

Pavel


>
>
>> Yours,
>> Laurenz Albe
>>
>