Thread
-
Re: SQLFunctionCache and generic plans
Alexander Pyhalov <a.pyhalov@postgrespro.ru> — 2025-02-14T15:11:47Z
Hi, folks. I've looked through performance and found that most performance issues was caused by CachedPlanSource machinery itself. At least, a lot of it. And so decided to go along and implement plan cache for sql functions. I'm not sure that it's clean enough, but at least it seems to be working. While working at it I've found issues in RevalidateCachedQuery() and fixed them. What have changed: - now plans are released after execution; - revalidation now takes locks on analyzed_tree; - query tree is copied prior to analyzing in RevalidateCachedQuery(); - queryTree_list in SQLFunctionCache is no necessary and so has gone. Now sql functions plans are actually saved. The most of it is a simplified version of plpgsql plan cache. Perhaps, I've missed something. We have some cases when we don't save CachedPlanSource for future use. One case is for trigger functions (pl_comp.c has some logic to handle them specially, and I didn't want to introduce it so far). Another (more interesting) issue is invalidation. SQL functions have a feature of rewriting query when targetlist doesn't match function call context. I haven't thought this through carefully during last patch version, but luckily got some tests, which showed this behavior. When compiled with RANDOMIZE_ALLOCATED_MEMORY, the following test case dumped core (because after invalidation executor got "name" fields, but expected text): create table t (nspname text, tname text); CREATE OR REPLACE FUNCTION get_basic_attributes_from_pg_tables( _schemaname name, _tablename name) RETURNS TABLE(tname text, tablespace text, owner text) LANGUAGE sql AS $function$ SELECT schemaname || '.' || tablename AS "full name", tablespace AS "tablespace", tableowner AS "tableowner" FROM pg_tables WHERE pg_tables.schemaname = _schemaname AND pg_tables.tablename = _tablename ORDER BY 1; $function$; create or replace function trg_func() RETURNS TRIGGER AS $$ declare t record; begin FOR t IN (SELECT * FROM get_basic_attributes_from_pg_tables(new.nspname, new.tname)) LOOP RAISE WARNING '"% % %"', t.owner, t.tablespace, t.tname; END LOOP; RETURN NEW; END $$ LANGUAGE PLPGSQL; create trigger t_ins_t AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION trg_func(); set debug_discard_caches to 1; insert into t values('pg_catalog', 'pg_class'); It's harder to achieve this without permanent cache (harder to trigger revalidation), but it's still possible. What happened here is that during revalidation query plan was rebuilt, but modifications to query tree, made by check_sql_fn_retval() , were lost. To fix this issue: 1) We avoid caching modified plans (and check_sql_fn_retval() now reports if it touched a query tree); 2) For non-cached plans we still need a hack (callback) into CachedPlanSource to rebuild query tree if invalidation happens. This callback rebuilds query tree, using check_sql_fn_retval(). We sure that callback parameters, specifying actual function return type, should not be recalculated, as such plans can appear only during one function execution and are not reused. 3) To prove that result type was not changed between plans execution, we build plans with fixed_result = true. 4) When we get saved plan, prior to using it, we check that result tlist matches the one built while planning function execution. Otherwise, we recreate CachedPlanSource. Well, it appeared more complicated than I've expected, but now it seems simple SQL functions have much better performance. create or replace function fx(int) returns int as $$ select $1 + $1; $$ language sql immutable; create or replace function fx2(int) returns int as $$ select 2 * $1; $$ language sql immutable; create or replace function fx3 (int) returns int immutable begin atomic select $1 + $1; end; create or replace function fx4(int) returns numeric as $$ select $1 + $1; $$ language sql immutable; -- sql function do $$ begin for i in 1..1000000 loop perform fx((random()*100)::int); end loop; end; $$; Time: 3008.869 ms (00:03.009) -- dynamic SQL do $$ begin for i in 1..1000000 loop execute 'select $1 + $1' using (random()*100)::int; end loop; end; $$; Time: 4915.295 ms (00:04.915)do $$ -- pre-parsed function begin for i in 1..1000000 loop perform fx3((random()*100)::int); end loop; end; $$; Time: 2992.166 ms (00:02.992) -- no plan caching due to need in fixing target list: do $$ begin for i in 1..1000000 loop perform fx4((random()*100)::int); end loop; end; $$; Time: 11020.820 ms (00:11.021) -- Best regards, Alexander Pyhalov, Postgres Professional