Thread

  1. Re: SQLFunctionCache and generic plans

    Alexander Pyhalov <a.pyhalov@postgrespro.ru> — 2025-02-14T15:11:47Z

    Hi, folks.
    
    I've looked through performance and found that most performance issues 
    was caused by CachedPlanSource machinery itself. At least, a lot of it. 
    And so decided to go along and implement plan cache for sql functions. 
    I'm not sure that it's clean enough, but at least it seems to be 
    working. While working at it I've found issues in 
    RevalidateCachedQuery() and fixed them. What have changed:
    - now plans are released after execution;
    - revalidation now takes locks on analyzed_tree;
    - query tree is copied prior to analyzing in RevalidateCachedQuery();
    - queryTree_list in SQLFunctionCache is no necessary and so has gone.
    
    Now sql functions plans are actually saved. The most of it is a 
    simplified version of plpgsql plan cache. Perhaps, I've missed 
    something.
    We have some cases when we don't save CachedPlanSource for future use. 
    One case is for trigger functions (pl_comp.c has some logic to handle 
    them specially, and I didn't want to introduce it so far). Another (more 
    interesting) issue is invalidation. SQL functions have a feature of 
    rewriting query when targetlist doesn't match function call context. I 
    haven't thought this through carefully during last patch version, but 
    luckily got some tests, which showed this behavior. When compiled with 
    RANDOMIZE_ALLOCATED_MEMORY, the following test case dumped core (because 
    after invalidation executor got "name" fields, but expected text):
    
    create table t (nspname text, tname  text);
    
    CREATE OR REPLACE FUNCTION get_basic_attributes_from_pg_tables( 
    _schemaname name, _tablename name)
      RETURNS TABLE(tname text, tablespace text, owner text)
      LANGUAGE sql
    AS $function$
         SELECT
             schemaname || '.' || tablename AS "full name",
             tablespace                     AS "tablespace",
             tableowner                     AS "tableowner"
         FROM pg_tables
         WHERE pg_tables.schemaname = _schemaname AND pg_tables.tablename = 
    _tablename
         ORDER BY 1;
    $function$;
    
    create or replace function trg_func() RETURNS TRIGGER
    AS
    $$
    declare
         t record;
    begin
       FOR t IN (SELECT * FROM  
    get_basic_attributes_from_pg_tables(new.nspname, new.tname)) LOOP
         RAISE WARNING '"% % %"', t.owner, t.tablespace, t.tname;
       END LOOP;
       RETURN NEW;
    END
    $$ LANGUAGE PLPGSQL;
    
    create trigger t_ins_t AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION 
    trg_func();
    
    set debug_discard_caches to 1;
    insert into t values('pg_catalog', 'pg_class');
    
    It's harder to achieve this without permanent cache (harder to trigger 
    revalidation), but it's still possible.
    
    What happened here is that during revalidation query plan was rebuilt, 
    but modifications to query tree, made  by check_sql_fn_retval() , were 
    lost.
    
    To fix this issue:
    1) We avoid caching modified plans (and check_sql_fn_retval() now 
    reports if it touched a query tree);
    2) For non-cached plans we still need a hack (callback) into 
    CachedPlanSource to rebuild query tree if invalidation happens. This 
    callback rebuilds query tree, using check_sql_fn_retval(). We sure that 
    callback parameters, specifying actual function return type, should not 
    be recalculated, as such plans can appear only during one function 
    execution and are not reused.
    3) To prove that result type was not changed between plans execution, we 
    build plans with fixed_result = true.
    4) When we get saved plan, prior to using it, we check that result tlist 
    matches the one built while planning function execution. Otherwise, we 
    recreate CachedPlanSource.
    
    Well, it appeared more complicated than I've expected, but now it seems 
    simple SQL functions have much better performance.
    
    create or replace function fx(int) returns int as $$ select $1 + $1; $$ 
    language sql immutable;
    create or replace function fx2(int) returns int as $$ select 2 * $1; $$ 
    language sql immutable;
    create or replace function fx3 (int) returns int immutable begin atomic 
    select $1 + $1; end;
    create or replace function fx4(int) returns numeric as $$ select $1 + 
    $1; $$ language sql immutable;
    
    -- sql function
    do $$
    begin
       for i in 1..1000000 loop
         perform fx((random()*100)::int);
       end loop;
    end;
    $$;
    Time: 3008.869 ms (00:03.009)
    
    
    -- dynamic SQL
    do
    $$ begin
       for i in 1..1000000 loop
         execute 'select $1 + $1' using (random()*100)::int;
       end loop;
    end;
    $$;
    Time: 4915.295 ms (00:04.915)do $$
    
    -- pre-parsed function
    begin
       for i in 1..1000000 loop
         perform fx3((random()*100)::int);
       end loop;
    end;
    $$;
    Time: 2992.166 ms (00:02.992)
    
    -- no plan caching due to need in fixing target list:
    do $$
    begin
       for i in 1..1000000 loop
         perform fx4((random()*100)::int);
       end loop;
    end;
    $$;
    Time: 11020.820 ms (00:11.021)
    
    
    -- 
    Best regards,
    Alexander Pyhalov,
    Postgres Professional