Thread
-
Re: TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c
Etsuro Fujita <etsuro.fujita@gmail.com> — 2025-08-22T10:59:40Z
On Wed, Aug 6, 2025 at 8:30 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > On Wed, Aug 6, 2025 at 3:01 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > The attached patch includes the draft fix and regression tests (using > > injection points). > > > > I don't have enough experience with the planner and FDW code area to > > evaluate whether the patch fixes the issue in the right approach. > > Feedback is very welcome. I've confirmed this assertion could happen > > with the same scenario on all supported branches. > > Will review. Thank you for the report and patch! First, my apologies for the delay. I reviewed the postgres_fdw.c part of the fix: @@ -6313,9 +6314,18 @@ postgresGetForeignJoinPaths(PlannerInfo *root, * calling foreign_join_ok(), since that function updates fpinfo and marks * it as pushable if the join is found to be pushable. */ - if (root->parse->commandType == CMD_DELETE || - root->parse->commandType == CMD_UPDATE || - root->rowMarks) + for (PlannerInfo *proot = root; proot != NULL; proot = proot->parent_root) + { + if (proot->parse->commandType == CMD_DELETE || + proot->parse->commandType == CMD_UPDATE || + proot->rowMarks) + { + need_epq = true; + break; + } + } + + if (need_epq) { epq_path = GetExistingLocalJoinPath(joinrel); if (!epq_path) I think this successfully avoids the assertion failure and produces the correct result, but sorry, I don't think it's the right way to go. I think the root cause of this issue is in the EPQ handling of foreign/custom joins in ExecScanFetch() in execScan.h: it runs the recheck method function for a given foreign/custom join whenever it is called for EPQ rechecking, but that is not 100% correct. I think the correct handling is: run the recheck method function for the join if it is called for EPQ rechecking and the join is a *descendant* join in the EPQ plan tree; otherwise run the access method function for the join even if it is called for EPQ rechecking, like the attached (where I used the epqParam of the given EPQState to determine whether the join is a descendant join or not, which localizes the fix pretty well). For the SELECT FOR UPDATE query shown upthread, when doing an EPQ recheck, the fix evaluates the sub-select expression in the target list by doing a remote join, not a local join, so it would work more efficiently than the fix you proposed. Sorry, I have not yet reviewed the isolation test part. Will do next. Best regards, Etsuro Fujita