Re: Parallel Apply
From: Amit Kapila <amit.kapila16@gmail.com>
To: Abhi Mehta <abhi15.mehta@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-09-16T09:51:19Z
Lists: pgsql-hackers

On Sat, Sep 13, 2025 at 9:49 PM Abhi Mehta <abhi15.mehta@gmail.com> wrote:
>
> Hi Amit,
>
> Really interesting proposal! I've been thinking through some of the
> implementation challenges:
>
> On the memory side: That hash table tracking RelationId and
> ReplicaIdentity could get pretty hefty under load. Maybe bloom filters
> could help with the initial screening? Also wondering about size caps
> with some kind of LRU cleanup when things get tight.
>

Yeah, this is an interesting thought and we should test, if we really
hit this case and if we could improve it with your suggestion.

> Worker bottleneck: This is the tricky part - hundreds of active
> transactions but only a handful of workers. Seems like we'll hit
> serialization anyway when workers are maxed out. What about spawning
> workers dynamically (within limits) or having some smart queuing for
> when we're worker-starved?
>

Yeah, we would have a GUC or subscription-option max parallel workers.
We can consider smart-queuing or any advanced techniques for such cases
after the first version is committed as making that work in itself is a
big undertaking.

> Alternative approach(if it can be consider): Rather than full
> parallelization, break transaction processing into overlapping stages:
>
> • Stage 1: Parse WAL records
>

Hmm, this is already performed by the publisher.

> • Stage 2: Analyze dependencies
> • Stage 3: Execute changes
> • Stage 4: Commit and track progress
>
> This creates a pipeline where Transaction A executes changes while
> Transaction B analyzes dependencies
>

I don't know how to make this work in the current framework of apply.
But feel free to propose this with some more details as to how it will
work?

-- 
With Regards,
Amit Kapila.
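
For illustration only, the size-capped dependency table with LRU cleanup raised in the quoted text could be sketched roughly as below. This is not code from any patch in this thread: `DependencyTable`, `record_change`, `mark_applied`, and `max_entries` are hypothetical names, and the sketch assumes that a writer evicted from the table must be treated conservatively as a dependency of every later change until it is known to be applied, since its conflict information has been dropped.

```python
from collections import OrderedDict


class DependencyTable:
    """Hypothetical bounded map: (relid, replica_identity) -> last writer xid."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._last_writer = OrderedDict()  # least recently used entry first
        self._evicted = set()              # writers whose entries were dropped

    def record_change(self, xid, relid, replica_identity):
        """Return the set of xids that must be applied before this change."""
        key = (relid, replica_identity)
        # Evicted writers are unknown conflicts, so wait for all of them.
        deps = set(self._evicted)
        prev = self._last_writer.pop(key, None)
        if prev is not None:
            deps.add(prev)
        self._last_writer[key] = xid       # re-insert as most recently used
        if len(self._last_writer) > self.max_entries:
            _, old_xid = self._last_writer.popitem(last=False)
            self._evicted.add(old_xid)
        deps.discard(xid)                  # a transaction never waits on itself
        return deps

    def mark_applied(self, xid):
        """Forget a transaction once its changes are applied and committed."""
        self._evicted.discard(xid)
        stale = [k for k, v in self._last_writer.items() if v == xid]
        for key in stale:
            del self._last_writer[key]


table = DependencyTable(max_entries=2)
first = table.record_change(100, 16384, (1,))   # no earlier writer
second = table.record_change(102, 16384, (1,))  # same row: depends on xid 100
```

The cap trades precision for memory: after an eviction, unrelated transactions may wait unnecessarily, but a real conflict is never missed. Whether this is worthwhile would need the testing mentioned above.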