Thread

  1. Re: Recovering from detoast-related catcache invalidations

    Andres Freund <andres@anarazel.de> — 2025-12-18T19:07:00Z

    Hi,
    
    On 2025-09-29 19:34:47 +0300, Arseniy Mukhin wrote:
    > On Fri, Sep 19, 2025 at 11:11 PM Andres Freund <andres@anarazel.de> wrote:
    > >
    > > On 2025-03-26 07:21:43 -0400, Andres Freund wrote:
    > > > On 2025-01-14 15:13:21 +0200, Heikki Linnakangas wrote:
    > > > > Committed with those fixes. Thanks for the review!
    > > >
    > > > The test doesn't seem entirely stable. E.g.
    > > > https://cirrus-ci.com/task/6166374147424256
    > > > failed spuriously:
    > > >
    > > > [08:52:06.822](0.002s) # issuing query 1 via background psql:
    > > > #     SELECT injection_points_set_local();
    > > > #     SELECT injection_points_attach('catcache-list-miss-systable-scan-started', 'wait');
    > > > [08:52:06.851](0.029s) # results query 1:
    > > > # {
    > > > #   'stderr' => 'background_psql: QUERY_SEPARATOR 1:
    > > > # ',
    > > > #   'stdout' => '
    > > > #
    > > > # background_psql: QUERY_SEPARATOR 1:
    > > > # '
    > > > # }
    > > > [08:52:06.893](0.042s) # issuing query 1 via background psql:
    > > > #     SELECT injection_points_wakeup('catcache-list-miss-systable-scan-started');
    > > > #     SELECT injection_points_detach('catcache-list-miss-systable-scan-started');
    > > > [08:52:06.897](0.004s) # pump_until: process terminated unexpectedly when searching for "(?^:(^|\n)background_psql: QUERY_SEPARATOR 1:\r?\n)" with stream: ""
    > > > process ended prematurely at /tmp/cirrus-ci-build/src/test/perl/PostgreSQL/Test/Utils.pm line 440.
    > > >
    > > >
    > > > 2025-03-25 08:52:06.896 UTC [34240][client backend] [007_catcache_inval.pl][4/2:0] ERROR:  could not find injection point catcache-list-miss-systable-scan-started to wake up
    > > > 2025-03-25 08:52:06.896 UTC [34240][client backend] [007_catcache_inval.pl][4/2:0] STATEMENT:  SELECT injection_points_wakeup('catcache-list-miss-systable-scan-started');
    > >
    > > And again: https://cirrus-ci.com/task/6082321633247232
    > >
    > > Ping?
    > >
    >
    > The wait_for_event call, which is typically used with a wait injection
    > point, is missing. Could this be the cause of instability? If this
    > makes sense, please find the attached fix.
    
    I was just reminded of this thread because I saw the failure again:
    https://cirrus-ci.com/task/5859971612540928
    (it's unrelated to the patch)
    
    I think you might be right - the wait point might not yet have been reached,
    because the query_until() just waits for "starting_bg_psql" being printed by
       \echo starting_bg_psql
       SELECT foofunc(1);
    
    while the wait point is only hit during the "SELECT foofunc(1)". There's no
    guarantee that we will have reached the wait point by the time the test
    issues injection_points_wakeup().
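
    To illustrate the ordering problem outside of the TAP framework, here is a
    toy Python sketch. Everything in it (Backend, wakeup, drive, the delay) is
    made up for illustration and is not PostgreSQL code; it only mimics the
    sequence "echo marker, slow statement, wait point" and shows why waiting
    for the echo alone is not enough.

```python
import threading
import time


class Backend:
    """Hypothetical stand-in for the background psql session."""

    def __init__(self, delay):
        self.delay = delay
        self.echoed = threading.Event()         # "starting_bg_psql" printed
        self.at_wait_point = threading.Event()  # wait injection point reached
        self.woken = threading.Event()          # released by wakeup()

    def run(self):
        self.echoed.set()          # \echo starting_bg_psql
        time.sleep(self.delay)     # time before the wait point is reached
        self.at_wait_point.set()   # SELECT foofunc(1) hits the wait point
        self.woken.wait(timeout=5)


def wakeup(backend):
    """Toy version of a wakeup call: fails if nobody is waiting yet."""
    if not backend.at_wait_point.is_set():
        return "ERROR: could not find injection point to wake up"
    backend.woken.set()
    return "ok"


def drive(wait_for_event, delay=0.5):
    backend = Backend(delay)
    thread = threading.Thread(target=backend.run)
    thread.start()

    backend.echoed.wait()              # all that the echo marker guarantees
    if wait_for_event:
        backend.at_wait_point.wait()   # extra wait, as a wait_for_event would do
    result = wakeup(backend)

    backend.woken.set()                # cleanup so the thread always exits
    thread.join()
    return result


if __name__ == "__main__":
    print("echo only:      ", drive(wait_for_event=False))
    print("with event wait:", drive(wait_for_event=True))
```

    Without the extra wait the wakeup runs while the toy backend is still
    sleeping and fails; with it the wakeup only runs once the wait point has
    actually been reached.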
    
    I found that I can reproduce the issue with
    
    --- i/src/test/modules/test_misc/t/007_catcache_inval.pl
    +++ w/src/test/modules/test_misc/t/007_catcache_inval.pl
    @@ -53,6 +53,7 @@ my $psql_session2 = $node->background_psql('postgres');
     # catcache list
     $psql_session->query_safe(
         qq[
    +    SELECT pg_sleep(0.1);
         SELECT injection_points_set_local();
         SELECT injection_points_attach('catcache-list-miss-systable-scan-started', 'wait');
     ]);
    @@ -62,6 +63,7 @@ $psql_session->query_safe(
     $psql_session->query_until(
         qr/starting_bg_psql/, q(
        \echo starting_bg_psql
    +   SELECT pg_sleep(3);
        SELECT foofunc(1);
     ));
     
    
    (The first pg_sleep() is just there so that the cache entry for pg_sleep is
    already loaded, which keeps the later pg_sleep(3) from hitting the injection
    point itself.)
    
    
    And indeed your patch fixes that.
    
    
    Greetings,
    
    Andres Freund