Thread
-
Re: Parallel INSERT SELECT take 2
Tomas Vondra <tomas@vondra.me> — 2026-05-17T11:24:29Z
On 5/11/26 13:44, Tomas Vondra wrote: > ... > >>> >>>>> alternative idea >>>>> ---------------- >>>>> >>>>> What if we took a very different approach, and just made sure the INSERT >>>>> part never runs concurrently with the SELECT? Say we fully materialize the >>>>> SELECT result (e.g. write it into a tuplestore), finish the parallel mode, and >>>>> only then do the INSERT? >>>>> ... I've been wondering about aspects of this approach, particularly about item (b) in my message, i.e. why we start/end the parallel mode ExecutePlan and not in the Gather node. Even if acquiring the XID before entering the parallel mode works, it's a bit annoying. So I asked about it on our Discord, and there was a little bit discussion, with the explanation that it's difficult to predict how exactly will the nodes interleave, and with doing this in Gather the nodes above might observe various values depending on the plan. At least that's how I understand/interpret the discussion. Of course, with the materialization this would not be an issue (I think). But Tom also mentioned one additional issue for parallelizing DML this way, and that's EvalPlanQual (EPQ), i.e. the mechanism we use to recheck concurrently updated rows in READ COMMITTED isolation mode. I don't see EPQ mentioned in these threads about parallelizing INSERT + SELECT at all. I suppose that's because EPQ is an issue only for UPDATE/DELETE queries, not for INSERT. Maybe everyone was aware of it, which is why the effort was limited to INSERT, not sure? The question is - is this acceptable? Did people expect to eventually extend the INSERT solution to UPDATE/DELETE in the future? Because the materialization does not really allow that, I think. Is it worth it, or would it be something we regret later? Not sure. In a way, users can already do exactly what the materialization does by CREATE UNLOGGED TABLE tmp AS SELECT ...; INSERT INTO t SELECT * FROM tmp; and they could even do something like this for UPDATE/DELETE, except that it (of course) won't do the EPQ recheck against relations in the original query. I'm not sure if we need to do something about EPQ. If it's acceptable to have parallel SELECT only for INSERT, then we don't need to do anything. In that case it's maybe a bit wrong to use "PARALLEL DML" in the syntax. If we need to eventually support all DML types, then I guess we'd need to come up with some way to do EPQ. A couple random ideas about that: a) It's probably not feasible to do EPQ rechecks with parallel plans, at least not exactly the way we do them now. The EPQ injects tuples into various scan nodes, and it's unclear how to do that with parallel plans. The leader would need to inject tuples into workers, it's not clear which worker(s) should get the tuple, maybe the workers are not still running at that point, etc. The EPQ state would have to be shared with the workers somehow, etc. That seems like a substantial amount of work. b) Perhaps it would be possible to identify plans that are somehow "safe" to parallelize / materialize as proposed for INSERTs? Are there plans that won't need to inject tuples into the "subquery" part of the ModifyTable? The executor/README says In UPDATE/DELETE/MERGE, only the target relation needs to be handled this way. so what if the target relation is not in the subplan at all (or in some part of it)? Of course, I'm not sure common such queries are, but in my experience it's not uncommon to have a large ETL job that aggregates a large table then inserts/merges the result into a different one. Also, queries with UPDATE ... FROM ... do a join with the target table, and EPQ may need to rerun the join, so the materialization would need to happen only on one side of the join (I think). c) Another option I can think of is doing the EPQ in the leader, a bit as if the parallel plan runs with no workers. So maybe we could "force" it to do this for EPQ rechecks? I suppose it'd need to have a completely separate plan for this purpose, not sure of that's possible. d) Maybe the answer to parallel DML is "don't do EPQ"? There's an article explaining how CockroachDB does EPQ, or rather how it solved the problem without EPQ by retrying the DML. Would be a significant change in behavior, of course, and I'm sure it has other disadvantages. e) AFAIK this is a problem only in READ COMMITTED. In SERIALIZABLE the DML simply fails if there are concurrent modifications, and it's up to the application to retry the whole thing (which for the ETL jobs would almost never happen). In a way this is a version of (d) but with the retry performed by the application. But it means parallelizing DML would be simpler in SERIALIZABLE, i.e. a stricter serialization mode, which is a bit surprising (but also not entirely). Anyway, all of this is mostly just a brain dump. I don't have a clear plan, or even a full understanding of how EPQ works, and the trade offs. regards -- Tomas Vondra