Thread

  1. Re: Optimize LISTEN/NOTIFY

    Joel Jacobson <joel@compiler.org> — 2025-11-23T15:49:26Z

    On Sat, Nov 22, 2025, at 22:30, Joel Jacobson wrote:
    > On Thu, Nov 20, 2025, at 21:26, Tom Lane wrote:
    >> I took a brief look through the v28 patch, and I'm fairly distressed
    >> at how much new logic has been stuffed into what's effectively a
    >> critical section.  It's totally not okay for AtCommit_Notify or
    >> anything it calls to throw an error
    >
    > Right, I agree. Thanks for guidance.
    >
    >> So I think there needs to be a serious effort made to move as
    >> much as we possibly can of the potentially-risky stuff into
    >> PreCommit_Notify.  In particular I think we ought to create
    >> the shared channel hash entry then, and even insert our PID
    >> into it.  We could expand the listenersArray entries to include
    >> both a PID and a boolean "is it REALLY listening?", and then
    >> during Exec_ListenCommit we'd only be required to find an
    >> entry we already added and set its boolean, so there's no OOM
    >> hazard.  Possibly do something similar with the local
    >> listenChannelsHash, so as to remove that possibility of OOM
    >> failure as well.
    >
    > Thanks for the idea, I like this approach. I've expanded the
    > listenersArray like suggested, and moved all risky stuff from
    > Exec_ListenCommit to PreCommit_Notify.
    >
    >> (An alternative design could be to go ahead and insert our
    >> PID during PreCommit_Notify, and just tolerate the small
    >> risk of getting signaled when we didn't need to be.  But
    >> then we'd need some mechanism for cleaning out the bogus
    >> entry during AtAbort_Notify.)
    >
    > We seem to need a cleanup mechanism also with the boolean field design,
    > since a channel could end up being added only to listenChannelsHash, but
    > not channelHash, which would trick IsListeningOn() into falsely thinking
    > we're listening on such channel when we're not. This could happen if
    > successfully adding the channel to listenChannelsHash, but OOM when
    > trying to add it to channelHash.
    >
    > AtAbort_Notify now handles such half-state, by reconciling all channels
    > that had LISTEN_LISTEN actions, using the channelHash as the single
    > source of truth, removing channels from both listenChannelsHash and
    > channelHash, unless the active field is true (which means we were
    > already listening to the channel due to a previous transaction).
    >
    > I've tested triggering the cleanup logic by adding elog ERROR that
    > triggered after listenChannelsHash insert, and another test that
    > triggered after channelHash insert, and it cleaned it up correctly. I
    > haven't created a test for it in tree though, would we want that?
    >
    >> I'm not sure what I think about the new logic in SignalBackends
    >> from this standpoint.  Making it very-low-probability-of-error
    >> definitely needs some consideration though.
    >
    > The initChannelHash call in SignalBackends is now gone, moved to
    > PreCommit_Notify (called if there are any pendingNotifies).
    >
    > I also took the liberty of fixing the XXX comment, to lazily preallocate
    > the signals arrays during PreCommit_Notify. It felt too inconsistent to
    > just leave that unfixed, but maybe should be a separate commit?
    
    I've extracted the preallocation of signals arrays into a separate patch:
    https://commitfest.postgresql.org/patch/6248/
    
    > I wonder how risky the remaining new logic in SignalBackends is. For
    > instance, I looked at dshash_find(..., false), and note it calls
    > LWLockAcquire which in turn could elog ERROR if num locks is exceeded,
    > but in master we're already calling LWLockAcquire in SignalBackends, so
    > should be fine I guess?
    >
    > Apart from that, I don't see any new logic in SignalBackends, that could
    > potentially be risky.
    
    /Joel