Thread
-
lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Chao Li <li.evan.chao@gmail.com> — 2025-12-29T09:21:26Z
Hi Hackers, I noticed this problem while reviewing Tom’s patch [1]. In lsyscache.c, there are multiple places that get IndexAmRoutine by GetIndexAmRoutineByAmId(). Only one function get_opmethod_canorder() pfree the returned object, while the other 3 functions get_op_index_interpretation(), equality_ops_are_compatible() and comparison_ops_are_compatible() all call GetIndexAmRoutineByAmId() in for loops but without freeing the memory. While these paths are not hot enough to cause leaks that matter in practice, I think for consistency, we should free the memory. [1] https://postgr.es/m/2380165.1766871097@sss.pgh.pa.us Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-29T16:05:02Z
Chao Li <li.evan.chao@gmail.com> writes: > I noticed this problem while reviewing Tom’s patch [1]. In lsyscache.c, > there are multiple places that get IndexAmRoutine > by GetIndexAmRoutineByAmId(). Only one function get_opmethod_canorder() > pfree the returned object, while the other 3 > functions get_op_index_interpretation(), equality_ops_are_compatible() > and comparison_ops_are_compatible() all call GetIndexAmRoutineByAmId() in > for loops but without freeing the memory. > While these paths are not hot enough to cause leaks that matter in > practice, I think for consistency, we should free the memory. I wonder if it wouldn't be better to rethink the convention that IndexAmRoutine structs are palloc'd. Is there any AM for which the contents aren't constant, so that we could just return a pointer to a static constant struct and save the palloc/pfree overhead? We'd want to const-ify the result type of GetIndexAmRoutine, and maybe that'd result in more notational churn than it's worth, but it seems like we're missing a bet. (I would not have proposed this before we started using C99 struct initializers. But now that we can, it doesn't seem like writing out the struct contents as an initializer would be too painful.) regards, tom lane
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-12-29T18:36:39Z
On Mon, 29 Dec 2025 at 17:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Chao Li <li.evan.chao@gmail.com> writes: > > I noticed this problem while reviewing Tom’s patch [1]. In lsyscache.c, > > there are multiple places that get IndexAmRoutine > > by GetIndexAmRoutineByAmId(). Only one function get_opmethod_canorder() > > pfree the returned object, while the other 3 > > functions get_op_index_interpretation(), equality_ops_are_compatible() > > and comparison_ops_are_compatible() all call GetIndexAmRoutineByAmId() in > > for loops but without freeing the memory. > > > While these paths are not hot enough to cause leaks that matter in > > practice, I think for consistency, we should free the memory. > > I wonder if it wouldn't be better to rethink the convention that > IndexAmRoutine structs are palloc'd. Is there any AM for which > the contents aren't constant, so that we could just return a pointer > to a static constant struct and save the palloc/pfree overhead? I'm not aware of any such AMs, and the only reason I can think of to change this dynamically is for specialization: the amroutine itself doesn't get sufficient information to return a specialized IndexAmRoutine pointer; and assuming that rd_indam itself isn't `const`-ified the specializing code would just have to change the pointed-to IndexAmRoutine instead of replacing individual am* functions in the struct. Requiring `const static` for IndexAMRoutine would make it more complicated to do JIT for index AMs correctly and without resource leaks, but such a feature would probably require more resource management hooks than are currently available to extensions, so I don't think we'll lose much there. > We'd want to const-ify the result type of GetIndexAmRoutine, and > maybe that'd result in more notational churn than it's worth, > but it seems like we're missing a bet. > (I would not have proposed this before we started using C99 > struct initializers. But now that we can, it doesn't seem > like writing out the struct contents as an initializer would > be too painful.) Yes, let's do it. Here's my patch that does this, pulled from [0]. It was originally built to reduce the per-index catcache overhead; as using static const pointers reduces the data allocated into the "index info" context by 264 bytes. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) [0]: https://www.postgresql.org/message-id/CAEze2Wi7tDidbDVJhu=Pstb2hbUXDCxx_VAZnKSqbTMf7k8+uQ@mail.gmail.com
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-29T19:01:03Z
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > On Mon, 29 Dec 2025 at 17:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I wonder if it wouldn't be better to rethink the convention that >> IndexAmRoutine structs are palloc'd. Is there any AM for which >> the contents aren't constant, so that we could just return a pointer >> to a static constant struct and save the palloc/pfree overhead? > I'm not aware of any such AMs, and the only reason I can think of to > change this dynamically is for specialization: the amroutine itself > doesn't get sufficient information to return a specialized > IndexAmRoutine pointer; and assuming that rd_indam itself isn't > `const`-ified the specializing code would just have to change the > pointed-to IndexAmRoutine instead of replacing individual am* > functions in the struct. Yeah, it's hard to credit any need for per-call specialization given that the amhandler receives no parameters. I can imagine per-session specialization, for instance as a result of wanting to JIT-compile the AM's methods. But you could still do that: on first call, build one copy of the IndexAMRoutine struct in a long-lived context, and then just keep returning pointers to that struct. The "const" requirement applies to the core code's references to the IndexAMRoutine struct, it does not constrain the code creating such a struct. > Here's my patch that does this, pulled from [0]. It was originally > built to reduce the per-index catcache overhead; as using static const > pointers reduces the data allocated into the "index info" context by > 264 bytes. Cool, I forgot you'd already been poking at this. The patch is kinda long, but not as bad as I feared, and it all looks quite mechanical. It lacks documentation updates though. regards, tom lane
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-29T19:17:25Z
... oh, one other thought: instead of what you did in InitIndexAmRoutine, we should probably do something like { MemoryContext oldcontext; /* * We formerly specified that the amhandler should return a * palloc'd struct. That's now deprecated in favor of returning * a pointer to a static struct, but to avoid completely breaking * old external AMs, run the amhandler in the relation's rd_indexcxt. */ oldcontext = MemoryContextSwitchTo(relation->rd_indexcxt); relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler); MemoryContextSwitchTo(oldcontext); } regards, tom lane -
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-12-29T20:04:53Z
On Mon, 29 Dec 2025 at 20:01, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Matthias van de Meent <boekewurm+postgres@gmail.com> writes >> Here's my patch that does this, pulled from [0]. It was originally >> built to reduce the per-index catcache overhead; as using static const >> pointers reduces the data allocated into the "index info" context by >> 264 bytes. > > Cool, I forgot you'd already been poking at this. The patch > is kinda long, but not as bad as I feared, and it all looks > quite mechanical. It lacks documentation updates though. Attached v2, in which I've updated the one place where I could find IndexAmRoutine's allocation policy being described, and in which I've also updated InitIndexAmRoutine with the suggested changes of your other mail. Thanks! Kind regards, Matthias van de Meent Databricks (https://www.databricks.com)
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-29T22:55:29Z
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > Attached v2, in which I've updated the one place where I could find > IndexAmRoutine's allocation policy being described, and in which I've > also updated InitIndexAmRoutine with the suggested changes of your > other mail. Thanks! I went through this and made some mostly-cosmetic changes. I think the attached v3 is ready to go, but I'll wait a day or so to see if anyone has any comments. regards, tom lane
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Chao Li <li.evan.chao@gmail.com> — 2025-12-29T23:21:38Z
> On Dec 30, 2025, at 04:04, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Mon, 29 Dec 2025 at 20:01, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Matthias van de Meent <boekewurm+postgres@gmail.com> writes >>> Here's my patch that does this, pulled from [0]. It was originally >>> built to reduce the per-index catcache overhead; as using static const >>> pointers reduces the data allocated into the "index info" context by >>> 264 bytes. >> >> Cool, I forgot you'd already been poking at this. The patch >> is kinda long, but not as bad as I feared, and it all looks >> quite mechanical. It lacks documentation updates though. > > Attached v2, in which I've updated the one place where I could find > IndexAmRoutine's allocation policy being described, and in which I've > also updated InitIndexAmRoutine with the suggested changes of your > other mail. Thanks! > > Kind regards, > > Matthias van de Meent > Databricks (https://www.databricks.com) > <v2-0001-Stop-allocating-one-IndexAmRoutine-for-every-inde.patch> I’m glad that my finding helped move this patch forward. After reviewing v2, I think my patch can be completely superseded by yours, which is fine with me. I have a couple of comments on v2. 1 - amapi.c ``` /* * GetIndexAmRoutine - call the specified access method handler routine to get * its IndexAmRoutine struct, which will be palloc'd in the caller's context. * * Note that if the amhandler function is built-in, this will not involve * any catalog access. It's therefore safe to use this while bootstrapping * indexes for the system catalogs. relcache.c relies on that. */ const IndexAmRoutine * GetIndexAmRoutine(Oid amhandler) { Datum datum; IndexAmRoutine *routine; ``` * The function header comment needs an update, it still talks about “palloc”, now it should say something like “returned structure should not be free-ed”. * The local variable “routine” now can be “const” as well. 2 - relcache.c ``` InitIndexAmRoutine(Relation relation) { - IndexAmRoutine *cached, - *tmp; + MemoryContext oldctx; /* - * Call the amhandler in current, short-lived memory context, just in case - * it leaks anything (it probably won't, but let's be paranoid). + * We formerly specified that the amhandler should return a + * palloc'd struct. That's now deprecated in favor of returning + * a pointer to a static struct, but to avoid completely breaking + * old external AMs, run the amhandler in the relation's rd_indexcxt. */ - tmp = GetIndexAmRoutine(relation->rd_amhandler); - - /* OK, now transfer the data into relation's rd_indexcxt. */ - cached = (IndexAmRoutine *) MemoryContextAlloc(relation->rd_indexcxt, - sizeof(IndexAmRoutine)); - memcpy(cached, tmp, sizeof(IndexAmRoutine)); - relation->rd_indam = cached; - - pfree(tmp); + oldctx = MemoryContextSwitchTo(relation->rd_indexcxt); + relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler); + MemoryContextSwitchTo(oldctx); } ``` I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine. However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help. I’m not sure whether we want to go further than the current approach, but it seems that fully eliminating the risk would require detecting dynamically allocated results and copying them into rd_indexcxt, which doesn’t appear easy or robust to implement in practice. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ -
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Chao Li <li.evan.chao@gmail.com> — 2025-12-29T23:26:43Z
On Tue, Dec 30, 2025 at 6:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > > Attached v2, in which I've updated the one place where I could find > > IndexAmRoutine's allocation policy being described, and in which I've > > also updated InitIndexAmRoutine with the suggested changes of your > > other mail. Thanks! > > I went through this and made some mostly-cosmetic changes. > I think the attached v3 is ready to go, but I'll wait a day > or so to see if anyone has any comments. > > regards, tom lane > > I just saw v3 after sending the review comments for v2. Looks like my comment 1 has been addressed in v3, and the comment 2 still stands. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-12-29T23:43:12Z
On Tue, 30 Dec 2025 at 00:22, Chao Li <li.evan.chao@gmail.com> wrote: > I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine. > > However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help. Yes, sadly it's quite difficult to determine whether something is statically allocated. Unlike with palloc'd objects, you can't rely on an always-available header in assert-enabled builds; the best we can do introspection into which OS memory area this allocation is placed; which encroaches on ASan and Valgrind's featureset --- and even that is not necessarily sufficient, as not all compilers may put `static const` -equivalent objects into knowable memory locations. > I’m not sure whether we want to go further than the current approach, but it seems that fully eliminating the risk would require detecting dynamically allocated results and copying them into rd_indexcxt, which doesn’t appear easy or robust to implement in practice. Yes, I don't think there is much more we can do here to protect index AM implementations against this change within Postgres' own tooling without introducing address space detection features more reasonably found in ASan or Valgrind. So I think this is sufficient as a best-effort approach. Kind regards, Matthias van de Meent
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-29T23:50:42Z
Chao Li <li.evan.chao@gmail.com> writes: > I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine. > However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help. I do not think we need to "enforce" that, and as you say it'd be quite hard to do so. The point of this MemoryContextSwitchTo is just to allow existing AMs that naively do palloc() as they were told will continue to work (modulo an increased chance of memory-leak issues since we removed some pfree's). If we don't do the switching, a non-updated AM will fail in very hard-to-diagnose ways once one of its rd_indam pointers becomes dangling because the context that was active at relcache-entry creation time has gone away. I think it's worth a small number of cycles to save external AM authors from having that debugging experience. I did test this, by dint of installing all the changes except the ones in the amhandler functions. That still passed check-world. I didn't attempt to analyze whether there were any new leaks of any significance. The main objection that could be raised to this is that the old coding ensures that any memory leaked during GetIndexAmRoutine() will be leaked in the expected-to-be-short-lived caller's context, but now it'd be leaked in the cache's rd_indexcxt. I don't believe that fmgr_info can leak any memory when calling a built-in function, but for extension functions there will be a syscache lookup and that could potentially have some incidental leaks. All in all though, I think this is a good tradeoff. We could perhaps remove those MemoryContextSwitchTos in a few years when we think everybody's updated their index AMs. regards, tom lane
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Álvaro Herrera <alvherre@kurilemu.de> — 2025-12-30T14:15:26Z
On 2025-Dec-29, Tom Lane wrote: > The main objection that could be raised to this is that the old coding > ensures that any memory leaked during GetIndexAmRoutine() will be > leaked in the expected-to-be-short-lived caller's context, but now > it'd be leaked in the cache's rd_indexcxt. One thing we can perhaps do is (in assert-enabled builds) to detect whether memory usage for that context has increased during InitIndexAmRoutine and raise a warning if so. Then extension authors would realize this and have a chance to fix it promptly. I tried this out and it doesn't work as well as I thought, due to how AllocSet works: we don't get a difference in the amount of allocated memory (thus no WARNING) unless we add some padding bytes to IndexAmRoutine, as shown below (of course, just POC-quality -- didn't kry to compile without assertions). Given this, I'm not terribly optimistic about this idea. I tested this by adding palloc(sizeof(IndexAmRoutine)) in brinhandler() and verifying that a few regression tests fail with the added warning message. diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 88259f7c228..75f2a2f5e37 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -1421,6 +1421,7 @@ static void InitIndexAmRoutine(Relation relation) { MemoryContext oldctx; + Size allocated PG_USED_FOR_ASSERTS_ONLY; /* * We formerly specified that the amhandler should return a palloc'd @@ -1429,7 +1430,19 @@ InitIndexAmRoutine(Relation relation) * the amhandler in the relation's rd_indexcxt. */ oldctx = MemoryContextSwitchTo(relation->rd_indexcxt); + +#ifdef USE_ASSERT_CHECKING + allocated = MemoryContextMemAllocated(CurrentMemoryContext, false); +#endif + relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler); + +#ifdef USE_ASSERT_CHECKING + if (MemoryContextMemAllocated(CurrentMemoryContext, false) - allocated != 0) + elog(WARNING, "memory allocated while initializing access method %u (was %zu, now %zu)", + relation->rd_amhandler, allocated, MemoryContextMemAllocated(CurrentMemoryContext, false)); +#endif + MemoryContextSwitchTo(oldctx); } diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h index 278da36bc08..78fcbcffc14 100644 --- a/src/include/access/amapi.h +++ b/src/include/access/amapi.h @@ -322,6 +322,11 @@ typedef struct IndexAmRoutine /* interface functions to support planning */ amtranslate_strategy_function amtranslatestrategy; /* can be NULL */ amtranslate_cmptype_function amtranslatecmptype; /* can be NULL */ + +#ifdef USE_ASSERT_CHECKING + char pad[512]; +#endif + } IndexAmRoutine; -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ -
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-12-30T14:41:03Z
On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2025-Dec-29, Tom Lane wrote: > > > The main objection that could be raised to this is that the old coding > > ensures that any memory leaked during GetIndexAmRoutine() will be > > leaked in the expected-to-be-short-lived caller's context, but now > > it'd be leaked in the cache's rd_indexcxt. > > One thing we can perhaps do is (in assert-enabled builds) to detect > whether memory usage for that context has increased during > InitIndexAmRoutine and raise a warning if so. Then extension authors > would realize this and have a chance to fix it promptly. > > I tried this out and it doesn't work as well as I thought, due to how > AllocSet works: we don't get a difference in the amount of allocated > memory (thus no WARNING) unless we add some padding bytes to > IndexAmRoutine Hmm, wouldn't we be able to detect changes in MemoryContextMemConsumed(ctx, counters) with one before and one after GetIndexAmRoutine(), such as included below? It would cause false positives if amroutine() does memory-related work other than just returning an appropriate IndexAmRoutine pointer (if it does so without switching to its own MemoryContext), but I don't think that such false positives here are necessarily bad - the AM shouldn't be doing much at this point in the code. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 86b90765433..c17f9c3e53a 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -1421,6 +1421,13 @@ static void InitIndexAmRoutine(Relation relation) { MemoryContext oldctx; +#if USE_ASSERT_CHECKING + Size prevusedmem; + MemoryContextCounters usage; + + MemoryContextMemConsumed(relation->rd_indexcxt, &usage); + prevusedmem = usage.totalspace - usage.freespace; +#endif /* * We formerly specified that the amhandler should return a @@ -1431,6 +1438,12 @@ InitIndexAmRoutine(Relation relation) oldctx = MemoryContextSwitchTo(relation->rd_indexcxt); relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler); MemoryContextSwitchTo(oldctx); + +#if USE_ASSERT_CHECKING + /* ensure the index routine was not palloc'd */ + MemoryContextMemConsumed(relation->rd_indexcxt, &usage); + Assert(prevusedmem == (usage.totalspace - usage.freespace)); +#endif } /* ``` -
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-30T15:25:00Z
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote: >> One thing we can perhaps do is (in assert-enabled builds) to detect >> whether memory usage for that context has increased during >> InitIndexAmRoutine and raise a warning if so. Then extension authors >> would realize this and have a chance to fix it promptly. > Hmm, wouldn't we be able to detect changes in > MemoryContextMemConsumed(ctx, counters) with one before and one after > GetIndexAmRoutine(), such as included below? I don't think we can do this, because there are effects that the amhandler doesn't have control over. In particular, if we have to load its pg_proc row into syscache during fmgr_info, I don't think that is positively guaranteed not to leak anything. (This isn't a factor for built-in AMs, which will take the fast path in fmgr_info, but it will be an issue for extensions.) I am not terribly concerned by one-time leaks of that sort, so I don't really feel an urge to try to complain about them. regards, tom lane
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-12-30T18:03:30Z
On Tue, 30 Dec 2025 at 16:25, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > > On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote: > >> One thing we can perhaps do is (in assert-enabled builds) to detect > >> whether memory usage for that context has increased during > >> InitIndexAmRoutine and raise a warning if so. Then extension authors > >> would realize this and have a chance to fix it promptly. > > > Hmm, wouldn't we be able to detect changes in > > MemoryContextMemConsumed(ctx, counters) with one before and one after > > GetIndexAmRoutine(), such as included below? > > I don't think we can do this, because there are effects that the > amhandler doesn't have control over. In particular, if we have to > load its pg_proc row into syscache during fmgr_info, I don't think > that is positively guaranteed not to leak anything. (This isn't > a factor for built-in AMs, which will take the fast path in > fmgr_info, but it will be an issue for extensions.) > > I am not terribly concerned by one-time leaks of that sort, so > I don't really feel an urge to try to complain about them. If it's difficult to filter out one-time leaks into the context caused by e.g. fmgr infra, then -indeed- it's probably not worth the effort. In which case, v3 LGTM. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com)
-
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-30T23:56:43Z
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > On Tue, 30 Dec 2025 at 16:25, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I am not terribly concerned by one-time leaks of that sort, so >> I don't really feel an urge to try to complain about them. > If it's difficult to filter out one-time leaks into the context caused > by e.g. fmgr infra, then -indeed- it's probably not worth the effort. > In which case, v3 LGTM. Pushed then. We can always tweak it if we think of a better answer. I did spend a bit of time thinking about this sketch: 1. Invent a memory context support method along the lines of bool ContextContainsPointer(MemoryContext cxt, const void *ptr) that returns "true" if ptr points anywhere within the context's owned memory space. The implementation I have in mind is just to scan the context's list of blocks and see if ptr >= block_start && ptr < block_end for any of them. This dodges problems like whether we can tell if the pointer points at a valid chunk. We aren't promising that, only that the pointer points somewhere in that memory area. 2. Remove the MemoryContextSwitchTo that's now in InitIndexAmRoutine, reverting to running the amhandler in the caller's context. Instead, in an assert-enabled build, throw error if ContextContainsPointer(CurrentMemoryContext, relation->rd_indam). This definitely will throw error if the amhandler just did a palloc(), and it definitely will not if the amhandler returned a pointer to a static variable. It won't throw error if the amhandler returned a pointer to malloc'd space (which is fine now, but would have been an error before) or space palloc'd from a different context. In those cases we have to assume the amhandler knows what it's doing. However, I feel slightly queasy about this idea anyway, because relcache.c doesn't really have a lot of business assuming what CurrentMemoryContext it's called in. Consider an extension AM that has a non-constant IndexAmRoutine (maybe it wants to point to JITted methods) and palloc's that in some suitably long-lived context. Is there any way in which we could reach InitIndexAmRoutine while running in that same long-lived context? This'd likely require a recursive index_open call while the AM is building its result, which is not all that hard to credit if the AM is trying to invoke JIT or the like. However any such index_opens are probably for system catalogs, which are not likely to be using extension AMs, so maybe the case can't be reached after all. Nonetheless it doesn't seem quite impossible to have a false positive, especially if the AM chooses a common context like CacheMemoryContext or TopMemoryContext for this purpose. Another objection is the tedium of writing all those ContextContainsPointer methods. I suppose that (at least for now) we could stub out all the ones except aset.c's. Anyway, I don't find this idea quite attractive enough to pursue, but maybe somebody else will think differently. It'd be more attractive if we had additional use-cases for ContextContainsPointer. regards, tom lane