Re: SQLFunctionCache and generic plans

Alexander Pyhalov <a.pyhalov@postgrespro.ru>

From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Pavel Stehule <pavel.stehule@gmail.com>, Alexander Korotkov <aekorotkov@gmail.com>, pgsql-hackers@lists.postgresql.org, Ronan Dunklau <ronan.dunklau@aiven.io>
Date: 2025-02-14T15:11:47Z
Lists: pgsql-hackers

Attachments

Hi, folks.

I've looked through performance and found that most performance issues 
was caused by CachedPlanSource machinery itself. At least, a lot of it. 
And so decided to go along and implement plan cache for sql functions. 
I'm not sure that it's clean enough, but at least it seems to be 
working. While working at it I've found issues in 
RevalidateCachedQuery() and fixed them. What have changed:
- now plans are released after execution;
- revalidation now takes locks on analyzed_tree;
- query tree is copied prior to analyzing in RevalidateCachedQuery();
- queryTree_list in SQLFunctionCache is no necessary and so has gone.

Now sql functions plans are actually saved. The most of it is a 
simplified version of plpgsql plan cache. Perhaps, I've missed 
something.
We have some cases when we don't save CachedPlanSource for future use. 
One case is for trigger functions (pl_comp.c has some logic to handle 
them specially, and I didn't want to introduce it so far). Another (more 
interesting) issue is invalidation. SQL functions have a feature of 
rewriting query when targetlist doesn't match function call context. I 
haven't thought this through carefully during last patch version, but 
luckily got some tests, which showed this behavior. When compiled with 
RANDOMIZE_ALLOCATED_MEMORY, the following test case dumped core (because 
after invalidation executor got "name" fields, but expected text):

create table t (nspname text, tname  text);

CREATE OR REPLACE FUNCTION get_basic_attributes_from_pg_tables( 
_schemaname name, _tablename name)
  RETURNS TABLE(tname text, tablespace text, owner text)
  LANGUAGE sql
AS $function$
     SELECT
         schemaname || '.' || tablename AS "full name",
         tablespace                     AS "tablespace",
         tableowner                     AS "tableowner"
     FROM pg_tables
     WHERE pg_tables.schemaname = _schemaname AND pg_tables.tablename = 
_tablename
     ORDER BY 1;
$function$;

create or replace function trg_func() RETURNS TRIGGER
AS
$$
declare
     t record;
begin
   FOR t IN (SELECT * FROM  
get_basic_attributes_from_pg_tables(new.nspname, new.tname)) LOOP
     RAISE WARNING '"% % %"', t.owner, t.tablespace, t.tname;
   END LOOP;
   RETURN NEW;
END
$$ LANGUAGE PLPGSQL;

create trigger t_ins_t AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION 
trg_func();

set debug_discard_caches to 1;
insert into t values('pg_catalog', 'pg_class');

It's harder to achieve this without permanent cache (harder to trigger 
revalidation), but it's still possible.

What happened here is that during revalidation query plan was rebuilt, 
but modifications to query tree, made  by check_sql_fn_retval() , were 
lost.

To fix this issue:
1) We avoid caching modified plans (and check_sql_fn_retval() now 
reports if it touched a query tree);
2) For non-cached plans we still need a hack (callback) into 
CachedPlanSource to rebuild query tree if invalidation happens. This 
callback rebuilds query tree, using check_sql_fn_retval(). We sure that 
callback parameters, specifying actual function return type, should not 
be recalculated, as such plans can appear only during one function 
execution and are not reused.
3) To prove that result type was not changed between plans execution, we 
build plans with fixed_result = true.
4) When we get saved plan, prior to using it, we check that result tlist 
matches the one built while planning function execution. Otherwise, we 
recreate CachedPlanSource.

Well, it appeared more complicated than I've expected, but now it seems 
simple SQL functions have much better performance.

create or replace function fx(int) returns int as $$ select $1 + $1; $$ 
language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$ 
language sql immutable;
create or replace function fx3 (int) returns int immutable begin atomic 
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 + 
$1; $$ language sql immutable;

-- sql function
do $$
begin
   for i in 1..1000000 loop
     perform fx((random()*100)::int);
   end loop;
end;
$$;
Time: 3008.869 ms (00:03.009)


-- dynamic SQL
do
$$ begin
   for i in 1..1000000 loop
     execute 'select $1 + $1' using (random()*100)::int;
   end loop;
end;
$$;
Time: 4915.295 ms (00:04.915)do $$

-- pre-parsed function
begin
   for i in 1..1000000 loop
     perform fx3((random()*100)::int);
   end loop;
end;
$$;
Time: 2992.166 ms (00:02.992)

-- no plan caching due to need in fixing target list:
do $$
begin
   for i in 1..1000000 loop
     perform fx4((random()*100)::int);
   end loop;
end;
$$;
Time: 11020.820 ms (00:11.021)


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional