Re: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
feichanghong <feichanghong@qq.com>
From: feichanghong <feichanghong@qq.com>
To: ocean_li_996 <ocean_li_996@163.com>
Cc: "mohen.lhy@alibaba-inc.com" <mohen.lhy@alibaba-inc.com>,
"pgsql-bugs@lists.postgresql.org" <pgsql-bugs@lists.postgresql.org>,
jdavis@postgresql.org
Date: 2025-09-03T02:20:48Z
Lists: pgsql-bugs
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Eliminate duplicative hashtempcxt in nodeSubplan.c.
- bdc6cfcd12f5 19 (unreleased) landed
-
Fix memory leakage in nodeSubplan.c.
- e1da9c072106 16.11 landed
- bc865ff6d1f0 18.0 landed
- abdeacdb0920 19 (unreleased) landed
- 8b6c29afd125 13.23 landed
- 862980f924ff 17.7 landed
- 5eab9b0a473e 14.20 landed
- 5ac973892b09 15.15 landed
-
Do execGrouping.c via expression eval machinery, take two.
- bf6c614a2f2c 11.0 cited
-
Fix potential failure when hashing the output of a subplan that produces
- 133924e13e00 9.1.0 cited
Hi, > On Sep 3, 2025, at 09:43, ocean_li_996 <ocean_li_996@163.com> wrote: > I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this. It seems this issue has been around for many years. I took a quick look at the patch for fixing it. Why don't we reset the temp context in the LookupTupleHashEntry, TupleHashTableHash, LookupTupleHashEntryHash, and FindTupleHashEntry functions? This seems more robust. Furthermore, the added test case doesn't seem to detect whether there's a memory leak. > > ----- Original Message ----- > From: "PG Bug reporting form" <noreply@postgresql.org> > To: pgsql-bugs@lists.postgresql.org > Cc: mohen.lhy@alibaba-inc.com > Sent: Tue, 02 Sep 2025 15:58:49 +0000 > Subject: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset > > The following bug has been logged on the website: > > Bug reference: 19040 > Logged by: haiyang li > Email address: mohen.lhy@alibaba-inc.com > PostgreSQL version: 18beta3 > Operating system: centos7 5.10.84 x86_64 > Description: > > Hello, all! > > I found a query which consumes a lot of memory and triggers OOM killer. > Memory leak occurs in hashed subplan node. > > I was able to create reproducible test case on machine with default config > and postgresql 18beta3: > > CREATE TABLE test1( > a numeric, > b int); > INSERT INTO > test1 > SELECT > i, > i > FROM > generate_series(1, 30000000) i; -- Make the running time longer > EXPLAIN ANALYZE SELECT > * > FROM > test1 > WHERE > a NOT IN( > SELECT > i > FROM > generate_series(1, 10000) i > ); > > plan: > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------ > Seq Scan on test1 (cost=125.00..612432.24 rows=15000108 width=10) (actual > time=135.191..25832.808 rows=29990000 loops=1) > Filter: (NOT (hashed SubPlan 1)) > Rows Removed by Filter: 10000 > SubPlan 1 > -> Function Scan on generate_series i (cost=0.00..100.00 rows=10000 > width=4) (actual time=36.999..38.296 rows=10000 loops=1) > Planning Time: 0.280 ms > JIT: > Functions: 15 > Options: Inlining true, Optimization true, Expressions true, Deforming > true > Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms, > Emission 23.018 ms, Total 110.802 ms > Execution Time: 28217.026 ms > (11 rows) > > > I observed that the process's RES (resident memory) was increasing rapidly > during SQL execution by using 'top -p <pid>' command. > > Furthermore, during SQL execution, I ran 'select > pg_log_backend_memory_contexts(<pid>)' > to print memory context statistics. The context with abnormally high memory > usage was > "Subplan HashTable Temp Context." The key part of the log is as follows: > > ... > LOG: level: 5; Subplan HashTable Temp Context: 514834432 total in 62849 > blocks; 973712 free (60695 chunks); 513860720 used > LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 > free (5 chunks); 400480 used > ... > Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks); > 516249784 used > ... > > If I change the SQL from "a NOT IN" to "b NOT IN" and do the same action, I > can not > observe abnormally high memory usage. Likewise, the key part of the log is > as follows: > ... > LOG: level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784 > free (0 chunks); 240 used > LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 > free (5 chunks); 400480 used > ... > Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416 > used > ... > > While analyzing the source code, I found that the hashed subplan node fails > to reset > the 'hashtempcxt' context after probing the hash table for each slot. > > When variable-length datatypes (e.g., numeric) are processed, this can > trigger calls > to 'detoast_attr', which allocate memory in hashtempcxt. Without a reset, > this memory > is not reclaimed until the context itself is destroyed, resulting in a > memory leak > when processing large numbers of slots. Additionally, through testing, I've found that if test1 is a partitioned table, multiple "Subplan HashTable Context" instances exist, and these MemoryContexts will only be released after execution is complete. If the number of subpartitions is large, it can lead to significant memory usage. Doesn't this differ from what the execution plan shows? The plan only displays one occurrence of SubPlan. Best Regards, Fei Changhong