Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Eliminate duplicative hashtempcxt in nodeSubplan.c.

  2. Fix memory leakage in nodeSubplan.c.

  3. Do execGrouping.c via expression eval machinery, take two.

  4. Fix potential failure when hashing the output of a subplan that produces

  1. BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    PG Bug reporting form <noreply@postgresql.org> — 2025-09-02T15:58:49Z

    The following bug has been logged on the website:
    
    Bug reference:      19040
    Logged by:          haiyang li
    Email address:      mohen.lhy@alibaba-inc.com
    PostgreSQL version: 18beta3
    Operating system:   centos7 5.10.84 x86_64
    Description:        
    
    Hello, all!
    
    I found a query which consumes a lot of memory and triggers OOM killer.
    Memory leak occurs in hashed subplan node.
    
    I was able to create reproducible test case on machine with default config
    and postgresql 18beta3:
    
    CREATE TABLE test1(
        a numeric,
        b int);
    INSERT INTO
        test1
    SELECT
        i,
        i
    FROM
        generate_series(1, 30000000) i; -- Make the running time longer
    EXPLAIN ANALYZE SELECT
        *
    FROM
        test1
    WHERE
        a NOT IN(
            SELECT
                i
            FROM
                generate_series(1, 10000) i
    );
    
    plan:
                                                                 QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------
     Seq Scan on test1  (cost=125.00..612432.24 rows=15000108 width=10) (actual
    time=135.191..25832.808 rows=29990000 loops=1)
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 10000
       SubPlan 1
         ->  Function Scan on generate_series i  (cost=0.00..100.00 rows=10000
    width=4) (actual time=36.999..38.296 rows=10000 loops=1)
     Planning Time: 0.280 ms
     JIT:
       Functions: 15
       Options: Inlining true, Optimization true, Expressions true, Deforming
    true
       Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms,
    Emission 23.018 ms, Total 110.802 ms
     Execution Time: 28217.026 ms
    (11 rows)
    
    
    I observed that the process's RES (resident memory) was increasing rapidly
    during SQL execution by using 'top -p <pid>' command.
    
    Furthermore, during SQL execution, I ran 'select
    pg_log_backend_memory_contexts(<pid>)'
    to print memory context statistics. The context with abnormally high memory
    usage was
    "Subplan HashTable Temp Context." The key part of the log is as follows:
    
    ...
    LOG:  level: 5; Subplan HashTable Temp Context: 514834432 total in 62849
    blocks; 973712 free (60695 chunks); 513860720 used
    LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks);
    516249784 used
    ...
    
    If I change the SQL from "a NOT IN" to "b NOT IN" and do the same action, I
    can not
    observe abnormally high memory usage. Likewise, the key part of the log is
    as follows:
    ...
    LOG:  level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784
    free (0 chunks); 240 used
    LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416
    used
    ...
    
    While analyzing the source code, I found that the hashed subplan node fails
    to reset
    the 'hashtempcxt' context after probing the hash table for each slot.
    
    When variable-length datatypes (e.g., numeric) are processed, this can
    trigger calls
    to 'detoast_attr', which allocate memory in hashtempcxt. Without a reset,
    this memory
    is not reclaimed until the context itself is destroyed, resulting in a
    memory leak
    when processing large numbers of slots.
    
    A patch implementing this fix will be included in the follow-up email.
    
    --
    Thanks,
    Haiyang Li
    
    
  2. 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-03T00:06:31Z

    I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this.
    --
    Thanks,
    Haiyang Li
    ------------------------------------------------------------------
    发件人:PG Bug reporting form <noreply@postgresql.org>
    发送时间:2025年9月3日(周三) 01:27
    收件人:"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>
    抄 送:"李海洋(陌痕)"<mohen.lhy@alibaba-inc.com>
    主 题:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
    The following bug has been logged on the website:
    Bug reference: 19040
    Logged by: haiyang li
    Email address: mohen.lhy@alibaba-inc.com
    PostgreSQL version: 18beta3
    Operating system: centos7 5.10.84 x86_64
    Description: 
    Hello, all!
    I found a query which consumes a lot of memory and triggers OOM killer.
    Memory leak occurs in hashed subplan node.
    I was able to create reproducible test case on machine with default config
    and postgresql 18beta3:
    CREATE TABLE test1(
     a numeric,
     b int);
    INSERT INTO
     test1
    SELECT
     i,
     i
    FROM
     generate_series(1, 30000000) i; -- Make the running time longer
    EXPLAIN ANALYZE SELECT
     *
    FROM
     test1
    WHERE
     a NOT IN(
     SELECT
     i
     FROM
     generate_series(1, 10000) i
    );
    plan:
     QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------
     Seq Scan on test1 (cost=125.00..612432.24 rows=15000108 width=10) (actual
    time=135.191..25832.808 rows=29990000 loops=1)
     Filter: (NOT (hashed SubPlan 1))
     Rows Removed by Filter: 10000
     SubPlan 1
     -> Function Scan on generate_series i (cost=0.00..100.00 rows=10000
    width=4) (actual time=36.999..38.296 rows=10000 loops=1)
     Planning Time: 0.280 ms
     JIT:
     Functions: 15
     Options: Inlining true, Optimization true, Expressions true, Deforming
    true
     Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms,
    Emission 23.018 ms, Total 110.802 ms
     Execution Time: 28217.026 ms
    (11 rows)
    I observed that the process's RES (resident memory) was increasing rapidly
    during SQL execution by using 'top -p <pid>' command.
    Furthermore, during SQL execution, I ran 'select
    pg_log_backend_memory_contexts(<pid>)'
    to print memory context statistics. The context with abnormally high memory
    usage was
    "Subplan HashTable Temp Context." The key part of the log is as follows:
    ...
    LOG: level: 5; Subplan HashTable Temp Context: 514834432 total in 62849
    blocks; 973712 free (60695 chunks); 513860720 used
    LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks);
    516249784 used
    ...
    If I change the SQL from "a NOT IN" to "b NOT IN" and do the same action, I
    can not
    observe abnormally high memory usage. Likewise, the key part of the log is
    as follows:
    ...
    LOG: level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784
    free (0 chunks); 240 used
    LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416
    used
    ...
    While analyzing the source code, I found that the hashed subplan node fails
    to reset
    the 'hashtempcxt' context after probing the hash table for each slot.
    When variable-length datatypes (e.g., numeric) are processed, this can
    trigger calls
    to 'detoast_attr', which allocate memory in hashtempcxt. Without a reset,
    this memory
    is not reclaimed until the context itself is destroyed, resulting in a
    memory leak
    when processing large numbers of slots.
    A patch implementing this fix will be included in the follow-up email.
    --
    Thanks,
    Haiyang Li
    
  3. Re:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    ocean_li_996 <ocean_li_996@163.com> — 2025-09-03T01:43:12Z

    For reasons I'm not yet aware of, my email reply to the bug report
    did not show up in the discussion thread. I am therefore resending
    the patch in a new email.
    
    
    I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this.
    
    
    --
    Thanks,
    Haiyang Li
    
    
    
    ----- Original Message -----
    From: "PG Bug reporting form" <noreply@postgresql.org>
    To: pgsql-bugs@lists.postgresql.org
    Cc: mohen.lhy@alibaba-inc.com
    Sent: Tue, 02 Sep 2025 15:58:49 +0000
    Subject: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
    
    The following bug has been logged on the website:
    
    Bug reference:      19040
    Logged by:          haiyang li
    Email address:      mohen.lhy@alibaba-inc.com
    PostgreSQL version: 18beta3
    Operating system:   centos7 5.10.84 x86_64
    Description:        
    
    Hello, all!
    
    I found a query which consumes a lot of memory and triggers OOM killer.
    Memory leak occurs in hashed subplan node.
    
    I was able to create reproducible test case on machine with default config
    and postgresql 18beta3:
    
    CREATE TABLE test1(
        a numeric,
        b int);
    INSERT INTO
        test1
    SELECT
        i,
        i
    FROM
        generate_series(1, 30000000) i; -- Make the running time longer
    EXPLAIN ANALYZE SELECT
        *
    FROM
        test1
    WHERE
        a NOT IN(
            SELECT
                i
            FROM
                generate_series(1, 10000) i
    );
    
    plan:
                                                                 QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------
     Seq Scan on test1  (cost=125.00..612432.24 rows=15000108 width=10) (actual
    time=135.191..25832.808 rows=29990000 loops=1)
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 10000
       SubPlan 1
         ->  Function Scan on generate_series i  (cost=0.00..100.00 rows=10000
    width=4) (actual time=36.999..38.296 rows=10000 loops=1)
     Planning Time: 0.280 ms
     JIT:
       Functions: 15
       Options: Inlining true, Optimization true, Expressions true, Deforming
    true
       Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms,
    Emission 23.018 ms, Total 110.802 ms
     Execution Time: 28217.026 ms
    (11 rows)
    
    
    I observed that the process's RES (resident memory) was increasing rapidly
    during SQL execution by using 'top -p <pid>' command.
    
    Furthermore, during SQL execution, I ran 'select
    pg_log_backend_memory_contexts(<pid>)'
    to print memory context statistics. The context with abnormally high memory
    usage was
    "Subplan HashTable Temp Context." The key part of the log is as follows:
    
    ...
    LOG:  level: 5; Subplan HashTable Temp Context: 514834432 total in 62849
    blocks; 973712 free (60695 chunks); 513860720 used
    LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks);
    516249784 used
    ...
    
    If I change the SQL from "a NOT IN" to "b NOT IN" and do the same action, I
    can not
    observe abnormally high memory usage. Likewise, the key part of the log is
    as follows:
    ...
    LOG:  level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784
    free (0 chunks); 240 used
    LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    free (5 chunks); 400480 used
    ...
    Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416
    used
    ...
    
    While analyzing the source code, I found that the hashed subplan node fails
    to reset
    the 'hashtempcxt' context after probing the hash table for each slot.
    
    When variable-length datatypes (e.g., numeric) are processed, this can
    trigger calls
    to 'detoast_attr', which allocate memory in hashtempcxt. Without a reset,
    this memory
    is not reclaimed until the context itself is destroyed, resulting in a
    memory leak
    when processing large numbers of slots.
    
    A patch implementing this fix will be included in the follow-up email.
    
    --
    Thanks,
    Haiyang Li
    
    
  4. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    feichanghong <feichanghong@qq.com> — 2025-09-03T02:20:48Z

    Hi,
    
    > On Sep 3, 2025, at 09:43, ocean_li_996 <ocean_li_996@163.com> wrote:
    > I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this.
    
    It seems this issue has been around for many years. I took a quick
    look at the patch for fixing it. Why don't we reset the temp context
    in the LookupTupleHashEntry, TupleHashTableHash, LookupTupleHashEntryHash,
    and FindTupleHashEntry functions? This seems more robust. Furthermore,
    the added test case doesn't seem to detect whether there's a memory leak.
    
    > 
    > ----- Original Message ----- 
    > From: "PG Bug reporting form" <noreply@postgresql.org> 
    > To: pgsql-bugs@lists.postgresql.org 
    > Cc: mohen.lhy@alibaba-inc.com 
    > Sent: Tue, 02 Sep 2025 15:58:49 +0000 
    > Subject: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset 
    > 
    > The following bug has been logged on the website:
    > 
    > Bug reference:      19040
    > Logged by:          haiyang li
    > Email address:      mohen.lhy@alibaba-inc.com
    > PostgreSQL version: 18beta3
    > Operating system:   centos7 5.10.84 x86_64
    > Description:        
    > 
    > Hello, all!
    > 
    > I found a query which consumes a lot of memory and triggers OOM killer.
    > Memory leak occurs in hashed subplan node.
    > 
    > I was able to create reproducible test case on machine with default config
    > and postgresql 18beta3:
    > 
    > CREATE TABLE test1(
    >     a numeric,
    >     b int);
    > INSERT INTO
    >     test1
    > SELECT
    >     i,
    >     i
    > FROM
    >     generate_series(1, 30000000) i; -- Make the running time longer
    > EXPLAIN ANALYZE SELECT
    >     *
    > FROM
    >     test1
    > WHERE
    >     a NOT IN(
    >         SELECT
    >             i
    >         FROM
    >             generate_series(1, 10000) i
    > );
    > 
    > plan:
    >                                                              QUERY PLAN
    > ------------------------------------------------------------------------------------------------------------------------------------
    >  Seq Scan on test1  (cost=125.00..612432.24 rows=15000108 width=10) (actual
    > time=135.191..25832.808 rows=29990000 loops=1)
    >    Filter: (NOT (hashed SubPlan 1))
    >    Rows Removed by Filter: 10000
    >    SubPlan 1
    >      ->  Function Scan on generate_series i  (cost=0.00..100.00 rows=10000
    > width=4) (actual time=36.999..38.296 rows=10000 loops=1)
    >  Planning Time: 0.280 ms
    >  JIT:
    >    Functions: 15
    >    Options: Inlining true, Optimization true, Expressions true, Deforming
    > true
    >    Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms,
    > Emission 23.018 ms, Total 110.802 ms
    >  Execution Time: 28217.026 ms
    > (11 rows)
    > 
    > 
    > I observed that the process's RES (resident memory) was increasing rapidly
    > during SQL execution by using 'top -p <pid>' command.
    > 
    > Furthermore, during SQL execution, I ran 'select
    > pg_log_backend_memory_contexts(<pid>)'
    > to print memory context statistics. The context with abnormally high memory
    > usage was
    > "Subplan HashTable Temp Context." The key part of the log is as follows:
    > 
    > ...
    > LOG:  level: 5; Subplan HashTable Temp Context: 514834432 total in 62849
    > blocks; 973712 free (60695 chunks); 513860720 used
    > LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    > free (5 chunks); 400480 used
    > ...
    > Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks);
    > 516249784 used
    > ...
    > 
    > If I change the SQL from "a NOT IN" to "b NOT IN" and do the same action, I
    > can not
    > observe abnormally high memory usage. Likewise, the key part of the log is
    > as follows:
    > ...
    > LOG:  level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784
    > free (0 chunks); 240 used
    > LOG:  level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808
    > free (5 chunks); 400480 used
    > ...
    > Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416
    > used
    > ...
    > 
    > While analyzing the source code, I found that the hashed subplan node fails
    > to reset
    > the 'hashtempcxt' context after probing the hash table for each slot.
    > 
    > When variable-length datatypes (e.g., numeric) are processed, this can
    > trigger calls
    > to 'detoast_attr', which allocate memory in hashtempcxt. Without a reset,
    > this memory
    > is not reclaimed until the context itself is destroyed, resulting in a
    > memory leak
    > when processing large numbers of slots.
    
    Additionally, through testing, I've found that if test1 is a partitioned
    table, multiple "Subplan HashTable Context" instances exist, and these
    MemoryContexts will only be released after execution is complete. If the
    number of subpartitions is large, it can lead to significant memory
    usage. Doesn't this differ from what the execution plan shows? The plan
    only displays one occurrence of SubPlan.
    
    
    Best Regards,
    Fei Changhong
    
    
  5. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-03T15:35:26Z

    feichanghong <feichanghong@qq.com> writes:
    > It seems this issue has been around for many years. I took a quick
    > look at the patch for fixing it. Why don't we reset the temp context
    > in the LookupTupleHashEntry, TupleHashTableHash, LookupTupleHashEntryHash,
    > and FindTupleHashEntry functions? This seems more robust.
    
    No, I disagree with that.  MemoryContextReset is not free.  The
    existing code seems to expect that the hash tempcxt will be a
    per-tuple context or similar, which will be reset once per executor
    cycle by existing mechanisms.  I wonder why ExecInitSubPlan is
    making a separate context for this at all, rather than using some
    surrounding short-lived context.
    
    If we do keep a separate context, I agree with the idea of one
    MemoryContextReset in the exit of ExecHashSubPlan, but the proposed
    patch seems like a mess.  I think what we ought to do is refactor
    ExecHashSubPlan so that there's exactly one "ExecClearTuple(slot)"
    down at the bottom, and then we can add a MemoryContextReset after it.
    
    > Furthermore,
    > the added test case doesn't seem to detect whether there's a memory leak.
    
    Yeah, test cases for memory leaks are problematic.  The only way they
    can really "detect" one is if they run so long as to be pretty much
    guaranteed to hit OOM, which is (a) impossible to quantify across
    a range of platforms and (b) not something we'd care to commit anyway.
    
    It's good if we have an example that one can watch to confirm
    it-leaks-or-not by monitoring the process's memory consumption,
    but I don't foresee committing it.
    
    			regards, tom lane
    
    
    
    
  6. 回复:Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-03T16:59:10Z

    Tom Lane  writes:
    
    
     If we do keep a separate context, I agree with the idea of one
     MemoryContextReset in the exit of ExecHashSubPlan, but the proposed
     patch seems like a mess.  I think what we ought to do is refactor
     ExecHashSubPlan so that there's exactly one "ExecClearTuple(slot)”
     down at the bottom, and then we can add a MemoryContextReset after it.
    
    
    The proposed patch was inspired by the approach used in ExecRecursiveUnion.
    
    Refactoring ExecHashSubPlan would be a better long‑term solution.
    
    
     It's good if we have an example that one can watch to confirm
     it-leaks-or-not by monitoring the process's memory consumption,
     but I don't foresee committing it.
    
    
    Should we omit the test case, or add one in the same form as in the patch?
    
    
    —
    
    Thanks,
    
    Haying Li
    
    
    
    
    
    
    
    
    
    
      	
     李海洋 
    阿里巴巴及蚂蚁集团  
     邮箱:mohen.lhy@alibaba-inc.com 
     地址:浙江-杭州-云谷园区 1-3C-577 
    	     
    	   		 阿里巴巴及蚂蚁集团  	 企业主页  		      
     信息安全声明:本邮件包含信息归发件人所在组织所有,发件人所在组织对该邮件拥有所有权利。
    请接收者注意保密,未经发件人书面许可,不得向任何第三方组织和个人透露本邮件所含信息的全部或部分。以上声明仅适用于工作邮件。
    Information Security Notice: The information contained in this mail is solely property of the sender's organization. 
     This mail communication is confidential. Recipients named above are obligated to maintain secrecy and are not permitted to disclose the contents of this communication to others.  ------------------------------------------------------------------
    发件人:Tom Lane<tgl@sss.pgh.pa.us>
    日 期:2025年09月03日 23:35:26
    收件人:feichanghong<feichanghong@qq.com>
    抄 送:ocean_li_996<ocean_li_996@163.com>; 李海洋(陌痕)<mohen.lhy@alibaba-inc.com>; pgsql-bugs@lists.postgresql.org<pgsql-bugs@lists.postgresql.org>; <jdavis@postgresql.org>
    主 题:Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
    
    feichanghong <feichanghong@qq.com> writes:
    > It seems this issue has been around for many years. I took a quick
    > look at the patch for fixing it. Why don't we reset the temp context
    > in the LookupTupleHashEntry, TupleHashTableHash, LookupTupleHashEntryHash,
    > and FindTupleHashEntry functions? This seems more robust.
    
    No, I disagree with that.  MemoryContextReset is not free.  The
    existing code seems to expect that the hash tempcxt will be a
    per-tuple context or similar, which will be reset once per executor
    cycle by existing mechanisms.  I wonder why ExecInitSubPlan is
    making a separate context for this at all, rather than using some
    surrounding short-lived context.
    
    If we do keep a separate context, I agree with the idea of one
    MemoryContextReset in the exit of ExecHashSubPlan, but the proposed
    patch seems like a mess.  I think what we ought to do is refactor
    ExecHashSubPlan so that there's exactly one "ExecClearTuple(slot)"
    down at the bottom, and then we can add a MemoryContextReset after it.
    
    > Furthermore,
    > the added test case doesn't seem to detect whether there's a memory leak.
    
    Yeah, test cases for memory leaks are problematic.  The only way they
    can really "detect" one is if they run so long as to be pretty much
    guaranteed to hit OOM, which is (a) impossible to quantify across
    a range of platforms and (b) not something we'd care to commit anyway.
    
    It's good if we have an example that one can watch to confirm
    it-leaks-or-not by monitoring the process's memory consumption,
    but I don't foresee committing it.
    
     regards, tom lane
    
  7. 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-05T05:13:20Z

    On 2025-09-03 23:35 Tom Lane <tgl@sss.pgh.pa.us> writes:
    > I wonder why ExecInitSubPlan is making a separate context for this at all,
    > rather than using some surrounding short-lived context.
    This behavior was introduced by commit 133924e to fix one bug. AFAICS, the
    tempcxt is only used by hashfunc evaluation. We should reset tempcxt after per 
    hashtable find if tempcxt is a separate context.
    > If we do keep a separate context, I agree with the idea of one
    > MemoryContextReset in the exit of ExecHashSubPlan, but the proposed
    > patch seems like a mess. I think what we ought to do is refactor
    > ExecHashSubPlan so that there's exactly one "ExecClearTuple(slot)”
    > down at the bottom, and then we can add a MemoryContextReset after it.
    Based on this thought, I have implemented a new patch in attachment. 
    It’s worth noting that a similar issue also exists in buildSubPlanHash. The leak
    occurs while building the hash table from the subplan’s result set. If the test SQL
    in [1] is modified to
    ```
    EXPLAIN ANALYZE
     SELECT * FROM test1
     WHERE a NOT IN 
     (SELECT a FROM test1 limit 100000);
    ```
     (and work_mem is adjusted so that the plan uses a hashed subplan), you can
    confirm it. To address this, I have also added a MemoryContextReset tempcxt
    after each hash table probe in buildSubPlanHash.
    Additional, the patch does not include a test case.
    Looking forward to your feedback.
    —
    Thanks
    Haiyang Li
    [1] https://www.postgresql.org/message-id/19040-c9b6073ef814f48c%40postgresql.org <https://www.postgresql.org/message-id/19040-c9b6073ef814f48c%40postgresql.org >
    
  8. Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-06T20:31:53Z

    "=?UTF-8?B?5p2O5rW35rSLKOmZjOeXlSk=?=" <mohen.lhy@alibaba-inc.com> writes:
    > On 2025-09-03 23:35 Tom Lane <tgl@sss.pgh.pa.us> writes:
    >> I wonder why ExecInitSubPlan is making a separate context for this at all,
    >> rather than using some surrounding short-lived context.
    
    > This behavior was introduced by commit 133924e to fix one bug. AFAICS, the
    > tempcxt is only used by hashfunc evaluation. We should reset tempcxt after per 
    > hashtable find if tempcxt is a separate context.
    
    I thought it unlikely that this leak has been there unreported since
    2010, so I tried the test case in older branches and soon found that
    v10 and older don't exhibit the leak, or at least it's much less
    virulent there.  A "git bisect" session found that the behavior
    changed at
    
        bf6c614a2f2c58312b3be34a47e7fb7362e07bcb is the first bad commit
        commit bf6c614a2f2c58312b3be34a47e7fb7362e07bcb
        Author: Andres Freund <andres@anarazel.de>
        Date:   Thu Feb 15 21:55:31 2018 -0800
    
            Do execGrouping.c via expression eval machinery, take two.
    
    Before that commit, TupleHashTableMatch reset hashtable->tempcxt
    (via execTuplesMatch).  Now what it resets is a different context
    hashtable->exprcontext->ecxt_per_tuple_memory, and so there's no
    reset of hashtable->tempcxt anywhere in this particular loop.
    
    So that leads me to not trust fixing this in nodeSubplan.c,
    because that's just one caller that can reach TupleHashTableMatch:
    assuming that there are no other similar leaks seems dangerous.
    
    I experimented with putting MemoryContextReset(hashtable->tempcxt)
    into TupleHashTableMatch, and that does stop this leak.  But even
    though that's a one-line fix, I don't like that solution either,
    for two reasons:
    
    1. TupleHashTableMatch is likely to be invoked multiple times per
    hashtable lookup, so that this way results in many more resets
    than are really necessary.
    
    2. It's not entirely clear that this way can't clobber data we
    still need, since the hash function outputs are surely longer-lived
    than individual tuple matching operations.
    
    After contemplating things for awhile, I think that feichanghong's
    idea is the right one after all: in each of the functions that switch
    into hashtable->tempcxt, let's do a reset on the way out, as attached.
    That's straightforward and visibly related to the required data
    lifespan.
    
    Interestingly, both in pre-v11 branches and with the one-line fix,
    I notice that the test query's memory consumption bounces around a
    little (10MB or so), while it seems absolutely steady with the
    attached.  I interpret that to mean that we weren't resetting
    the tempcxt quite often enough, so that there was some amount of
    leakage in between calls of TupleHashTableMatch, even though we
    got there often enough to avoid disaster in this particular test.
    That's another reason to prefer this way over other solutions,
    I think.
    
    			regards, tom lane
    
    
  9. 回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-07T08:24:05Z

    On 2025-09-06 20:31:53 Tom Lane <tgl@sss.pgh.pa.us> writes:
    > After contemplating things for awhile, I think that feichanghong’s
    > idea is the right one after all: in each of the functions that switch
    > into hashtable->tempcxt, let's do a reset on the way out, as attached.
    > That's straightforward and visibly related to the required data
    > lifespan.
    I have considered this approach as well, but my concern is that "tempcxt"
    is not always an independent memory context. In some cases, it references
    another context — for example, in nodeSetOp.c’s "build_hash_table", “tempcxt"
    points to "setopstate->ps.ps_ExprContext->ecxt_per_tuple_memory". There is
    similar usage in nodeAgg.c as well. I’m not entirely sure that this approach would
    not discard data we still need, because the lifespan of
    "ps_ExprContext->ecxt_per_tuple_memory" seems to be longer than “tempcxt”.
    Should we make tempcxt a completely independent memory context?
    — 
    Thanks
    Haiyang Li
    
  10. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    feichanghong <feichanghong@qq.com> — 2025-09-07T09:02:23Z

    
    > On Sep 7, 2025, at 16:24, 李海洋(陌痕) <mohen.lhy@alibaba-inc.com> wrote:
    > 
    > On 2025-09-06 20:31:53 Tom Lane <tgl@sss.pgh.pa.us> writes:
    > 
    > > After contemplating things for awhile, I think that feichanghong’s
    > > idea is the right one after all: in each of the functions that switch
    > > into hashtable->tempcxt, let's do a reset on the way out, as attached.
    > > That's straightforward and visibly related to the required data
    > > lifespan.
    > 
    > I have considered this approach as well, but my concern is that "tempcxt"
    > is not always an independent memory context. In some cases, it references
    > another context — for example, in nodeSetOp.c’s "build_hash_table", “tempcxt"
    > points to "setopstate->ps.ps_ExprContext->ecxt_per_tuple_memory". There is
    > similar usage in nodeAgg.c as well. I’m not entirely sure that this approach would
    > not discard data we still need, because the lifespan of
    > "ps_ExprContext->ecxt_per_tuple_memory" seems to be longer than “tempcxt”.
    
    Yes, I agree with that.
    
    > Should we make tempcxt a completely independent memory context?
    
    It looks fine. Perhaps we don't need to pass tempcxt to BuildTupleHashTable,
    but rather create a new one within it. After each switch to tempcxt, we should
    clean it up using MemoryContextReset.
    
    
    Best Regards,
    Fei Changhong
    
    
  11. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-08T17:23:12Z

    feichanghong <feichanghong@qq.com> writes:
    >> On Sep 7, 2025, at 16:24, 李海洋(陌痕) <mohen.lhy@alibaba-inc.com> wrote:
    >> On 2025-09-06 20:31:53 Tom Lane <tgl@sss.pgh.pa.us> writes:
    >>> After contemplating things for awhile, I think that feichanghong’s
    >>> idea is the right one after all: in each of the functions that switch
    >>> into hashtable->tempcxt, let's do a reset on the way out, as attached.
    
    >> I have considered this approach as well, but my concern is that "tempcxt"
    >> is not always an independent memory context. In some cases, it references
    >> another context — for example, in nodeSetOp.c’s "build_hash_table", “tempcxt"
    >> points to "setopstate->ps.ps_ExprContext->ecxt_per_tuple_memory". There is
    >> similar usage in nodeAgg.c as well. I’m not entirely sure that this approach would
    >> not discard data we still need, because the lifespan of
    >> "ps_ExprContext->ecxt_per_tuple_memory" seems to be longer than “tempcxt”.
    
    > Yes, I agree with that.
    
    Yeah, that is a fair point.  The existing API is that the caller is
    responsible for resetting tempcxt sufficiently often, and it looks
    like nodeSubplan.c is the only place that gets this wrong.  Let's
    just fix nodeSubplan.c, add a comment documenting this requirement,
    and call it good.
    
    >> Should we make tempcxt a completely independent memory context?
    
    > It looks fine. Perhaps we don't need to pass tempcxt to BuildTupleHashTable,
    > but rather create a new one within it. After each switch to tempcxt, we should
    > clean it up using MemoryContextReset.
    
    I thought about that too, but that would result in two short-lived
    contexts and two reset operations per tuple cycle where only one
    is needed.  I'm rather tempted to fix nodeSubplan.c by making it
    use innerecontext->ecxt_per_tuple_memory instead of a separate
    hash tmpcontext.  That context is getting reset already, at least in
    buildSubPlanHash().  That's probably too risky for the back branches
    but we could do it in HEAD.
    
    			regards, tom lane
    
    
    
    
  12. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-08T20:46:10Z

    I wrote:
    > I thought about that too, but that would result in two short-lived
    > contexts and two reset operations per tuple cycle where only one
    > is needed.  I'm rather tempted to fix nodeSubplan.c by making it
    > use innerecontext->ecxt_per_tuple_memory instead of a separate
    > hash tmpcontext.  That context is getting reset already, at least in
    > buildSubPlanHash().  That's probably too risky for the back branches
    > but we could do it in HEAD.
    
    Concretely, I'm thinking about the attached.  0001 is the same
    logic as in the v02 patch, but I felt we could make the code
    be shorter and prettier instead of longer and uglier.  That's
    meant for back-patch, and then 0002 is for master only.
    
    			regards, tom lane
    
    
  13. 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-09T01:52:26Z

    On 2025-09-08 20:46:10 Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Concretely, I'm thinking about the attached. 0001 is the same
    > logic as in the v02 patch, but I felt we could make the code
    > be shorter and prettier instead of longer and uglier. That’s
    > meant for back-patch, and then 0002 is for master only.
    Yeah, v04-0001 and v04-0002 look good for me. 
    Thanks for refining the patch.
    —
    Thanks
    Haiyang Li
    
  14. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    feichanghong <feichanghong@qq.com> — 2025-09-09T02:51:56Z

    
    > On Sep 9, 2025, at 04:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > 
    > I wrote:
    >> I thought about that too, but that would result in two short-lived
    >> contexts and two reset operations per tuple cycle where only one
    >> is needed.  I'm rather tempted to fix nodeSubplan.c by making it
    >> use innerecontext->ecxt_per_tuple_memory instead of a separate
    >> hash tmpcontext.  That context is getting reset already, at least in
    >> buildSubPlanHash().  That's probably too risky for the back branches
    >> but we could do it in HEAD.
    > 
    > Concretely, I'm thinking about the attached.  0001 is the same
    > logic as in the v02 patch, but I felt we could make the code
    > be shorter and prettier instead of longer and uglier.  That's
    > meant for back-patch, and then 0002 is for master only.
    > 
    > 			regards, tom lane
    
    
    The v04-0001 looks good for me. I am also considering whether there is a way to
    detect a memory leak. One preliminary idea is that for short-lived context such
    as the "Subplan HashTable Temp Context", we can assume that MemoryContextReset
    will be called frequently. Therefore, at the time of deletion, the memory usage
    should not be excessively large. Based on this assumption, we could implement
    the following check:
    ```
    diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
    index 15fa4d0a55e..56218cb6863 100644
    --- a/src/backend/utils/mmgr/mcxt.c
    +++ b/src/backend/utils/mmgr/mcxt.c
    @@ -166,6 +166,7 @@ static void MemoryContextStatsInternal(MemoryContext context, int level,
     static void MemoryContextStatsPrint(MemoryContext context, void *passthru,
     									const char *stats_string,
     									bool print_to_stderr);
    +static bool CheckMemoryContextLeak(MemoryContext context);
     
     /*
      * You should not do memory allocations within a critical section, because
    @@ -502,6 +503,7 @@ MemoryContextDeleteOnly(MemoryContext context)
     	Assert(context != CurrentMemoryContext);
     	/* All the children should've been deleted already */
     	Assert(context->firstchild == NULL);
    +	Assert(CheckMemoryContextLeak(context));
     
     	/*
     	 * It's not entirely clear whether 'tis better to do this before or after
    @@ -530,6 +532,24 @@ MemoryContextDeleteOnly(MemoryContext context)
     	VALGRIND_DESTROY_MEMPOOL(context);
     }
     
    +static bool
    +CheckMemoryContextLeak(MemoryContext context)
    +{
    +#ifdef MEMORY_CONTEXT_CHECKING
    +	if (!context->name)
    +		return true;
    +
    +	if (!strcmp(context->name, "Subplan HashTable Temp Context") == 0)
    +		return true;
    +
    +	/* The size of short-lived contexts should be kept under 1 MB. */
    +	if ((MemoryContextMemAllocated(context, false) > 1024 * 1024))
    +		return false;
    +#endif
    +	return true;
    +}
    +
    +
     /*
      * MemoryContextDeleteChildren
      *		Delete all the descendants of the named context and release all
    ```
    
    In debug mode, a memory leak can be easily detected with the following SQL.
    After applying v04-0001, it runs normally:
    ```sql
    with s as (
        select
            i::numeric as a
        from
            generate_series(1, 50000) i
    )
    select * from s where a not in (select * from s)
    ```
    
    For v04-0002, I checked the commit history, and before commit 133924e1,
    buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
    "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.
    
    
    Best Regards,
    Fei Changhong
    
    
  15. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-09T03:05:09Z

    feichanghong <feichanghong@qq.com> writes:
    > For v04-0002, I checked the commit history, and before commit 133924e1,
    > buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
    > "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.
    
    It was a different ExprContext's ecxt_per_tuple_memory, though.
    This one is owned locally by the TupleHashTable.
    
    			regards, tom lane
    
    
    
    
  16. 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-09T07:30:48Z

    On 2025-09-09 03:05:09 Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > feichanghong <feichanghong@qq.com> writes:
    > > For v04-0002, I checked the commit history, and before commit 133924e1,
    > > buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
    > > "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.
    > 
    > It was a different ExprContext's ecxt_per_tuple_memory, though.
    > This one is owned locally by the TupleHashTable.
    I checked buildSubPlanHash in nodeSubplan.c before commit 133924e1. AFAICS, the tempcxts
    are both referenced innerecontext->ecxt_per_tuple_memory in v04-0002 and before commit
    133924e1. They are same. 
    However, the changed behavior of TupleHashTableMatch introduced by commit
    bf6c614a (noted in [1]) may make the condition:
    ```
    However, the hashtable routines feel free to reset their temp context at any time, 
    which'd lead to destroying input data that was still needed.
    ```
    no longer holds true. Then,  the lifespan of tempcxt in buildHashtable is similar
    to that of innercontext->ecxt_per_tuple_memory, so it makes sense to merge the two,
    I think.
    BTW, I ran the test case supported in commit 133924e1 on version not contained commit
    133924e1 (tag REL8_0_26). I did not find any problems. But i can not find more information
    about this issue. 
    Just to be safe, I think we should verify this issue.
    [1] https://www.postgresql.org/message-id/160523.1757190713%40sss.pgh.pa.us <https://www.postgresql.org/message-id/160523.1757190713%40sss.pgh.pa.us >
    — 
    Thanks 
    Haiyang Li
    
  17. Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-09T22:24:22Z

    "=?UTF-8?B?5p2O5rW35rSLKOmZjOeXlSk=?=" <mohen.lhy@alibaba-inc.com> writes:
    > For v04-0002, I checked the commit history, and before commit 133924e1,
    > buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
    > "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.
    
    You're quite right that 0002 looks suspiciously like it's reverting
    133924e1.  However, it doesn't cause the test case added by that
    commit to fail (even under Valgrind).  I think the reason is what
    you say next:
    
    > However, the changed behavior of TupleHashTableMatch introduced by commit
    > bf6c614a (noted in [1]) may make the condition:
    > ```
    > However, the hashtable routines feel free to reset their temp context at any time, 
    > which'd lead to destroying input data that was still needed.
    > ```
    > no longer holds true. Then,  the lifespan of tempcxt in buildHashtable is similar
    > to that of innercontext->ecxt_per_tuple_memory, so it makes sense to merge the two,
    > I think.
    
    Looking around in 133924e1^, the only place in execGrouping.c that
    would reset the hashtable's tempcxt was execTuplesMatch (which was
    called by TupleHashTableMatch).  But bf6c614a removed execTuplesMatch.
    The modern execGrouping.c code resets hashtable->tempcxt nowhere.
    Instead, TupleHashTableMatch applies ExecQualAndReset to
    hashtable->exprcontext, so that what is reset is the
    ecxt_per_tuple_memory of that econtext --- but that's manufactured by
    CreateStandaloneExprContext in BuildTupleHashTable, and has no
    connection at all to any context that nodeSubplan.c will touch.
    It is certainly not the same ecxt_per_tuple_memory that pre-133924e1
    was resetting.
    
    So basically I'm saying that bf6c614a made it okay to revert
    133924e1.
    
    > BTW, I ran the test case supported in commit 133924e1 on version not contained commit
    > 133924e1 (tag REL8_0_26). I did not find any problems. But i can not find more information
    > about this issue. 
    
    Digging in the archives, I found the discussion leading to 133924e1 at
    https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org
    
    As for trying it on 8.0.26, the commit message for 133924e1 says
    specifically that the problem isn't there pre-8.1.  I did try to
    duplicate your test using 133924e1^, and it didn't fail for me either.
    But I'm not too excited about that, because building PG versions this
    old on modern platforms is really hard.  I had to compile with -O0
    to get a build that worked at all, and that's already a significant
    difference from the code we would have been testing back in 2010.
    It may be that the reason for non-reproduction is buried somewhere in
    that, or in the different compiler toolchain.  I'm not sure it's worth
    a lot of effort to dig deeply into the details.
    
    			regards, tom lane
    
    
    
    
  18. 回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Haiyang Li <mohen.lhy@alibaba-inc.com> — 2025-09-10T01:55:52Z

    On 2025-09-09 22:24:22 Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Digging in the archives, I found the discussion leading to 133924e1 at
    > https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org <https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org >
    Having understood the background of this issue, and given that v04-0002 only applies to master, 
    v04-0002 LGTM, again.
    Thanks for your analysis. 
    —
    Thanks
    Haiyang Li
    
  19. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    feichanghong <feichanghong@qq.com> — 2025-09-10T02:32:00Z

    
    > On Sep 10, 2025, at 09:55, 李海洋(陌痕) <mohen.lhy@alibaba-inc.com> wrote:
    > 
    > On 2025-09-09 22:24:22 Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > 
    > > Digging in the archives, I found the discussion leading to 133924e1 at
    > > https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org
    > 
    > 
    > Having understood the background of this issue, and given that v04-0002 only applies to master, 
    > v04-0002 LGTM, again.
    > 
    > Thanks for your analysis. 
    
    Thanks for the analysis! v04-0002 looks fine to me — please just apply it on
    HEAD.
    
    Best Regards,
    Fei Changhong
    
    
  20. Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-10T20:18:05Z

    feichanghong <feichanghong@qq.com> writes:
    > Thanks for the analysis! v04-0002 looks fine to me — please just apply it on
    > HEAD.
    
    Done that way.
    
    I thought a bit about your suggestion of adding some kind of
    assertion check for memory leaks, but it looks too messy and
    specialized as-is.  Maybe with reflection we can find a more
    generic idea.
    
    			regards, tom lane