Thread

  1. Fwd: Automating Failover Resync & Re-Attach in pgpool2

    VASUKI M <vasukim1992002@gmail.com> — 2025-10-23T07:13:04Z

    Hi Bo,
    
    Thank you very much for your clarification and the helpful links on
    follow_primary_command and auto_failback. I went through those sections in
    the documentation, and I now understand that Pgpool-II can automatically
    follow the new primary and reattach a standby node once it becomes
    available again.
    
    However, my idea was aimed at handling cases where the *old primary
    diverges in timeline or LSN* after a failover — for example, when the new
    primary executes additional writes before the old primary rejoins. In such
    cases, the existing auto-failback or follow-primary mechanisms can’t
    directly reattach the old node because its data is no longer in sync with
    the current primary.
    
    To address that, I was exploring a built-in *auto-resync enhancement* where
    Pgpool-II could internally perform the following before reattaching:
    
       1. *Detect timeline mismatch* between the new primary and the returning
          node.
       2. *Automatically run pg_rewind* (or WAL-based replay) to synchronize the
          old node’s data directory.
       3. *Restart and reattach the node* to the pool automatically once the
          resync is complete.
    
    This would essentially extend the existing auto_failback behavior to
    include *automated resynchronization*, reducing manual intervention and
    ensuring consistent cluster recovery even in timeline divergence scenarios.
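    
    As a rough illustration of the three steps above, here is a minimal Python
    sketch of what an external helper might do today. It is not Pgpool-II code.
    The function names, the node paths, and the PCP defaults are hypothetical;
    only the pg_controldata output line, the pg_rewind options
    (--target-pgdata, --source-server), pg_ctl, and pcp_attach_node are
    existing tools. The sketch only builds the command list and does not run
    anything.

```python
import re


def parse_timeline_id(controldata_output: str) -> int:
    """Extract the latest checkpoint's timeline ID from pg_controldata output."""
    match = re.search(
        r"^Latest checkpoint's TimeLineID:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("TimeLineID not found in pg_controldata output")
    return int(match.group(1))


def may_need_rewind(old_node_tli: int, primary_tli: int) -> bool:
    """Step 1: a lower timeline on the returning node hints at divergence.

    This is only a heuristic; pg_rewind itself decides whether a rewind
    is actually required.
    """
    return old_node_tli < primary_tli


def build_resync_plan(pgdata, primary_host, primary_port, resync_user, node_id,
                      pcp_host="localhost", pcp_port=9898, pcp_user="pgpool"):
    """Steps 2 and 3: return the commands to rewind, restart, and reattach.

    Assumptions (hypothetical): the old node is already cleanly shut down,
    standby configuration (standby.signal, primary_conninfo) is handled
    separately, and the PCP defaults match the local installation.
    """
    source = (
        f"host={primary_host} port={primary_port} "
        f"user={resync_user} dbname=postgres"
    )
    return [
        # Step 2: synchronize the old node's data directory with the new primary.
        ["pg_rewind", f"--target-pgdata={pgdata}", f"--source-server={source}"],
        # Step 3a: restart the rewound node.
        ["pg_ctl", "-D", pgdata, "start"],
        # Step 3b: reattach the node to Pgpool-II through PCP.
        ["pcp_attach_node", "-h", pcp_host, "-p", str(pcp_port),
         "-U", pcp_user, "-w", "-n", str(node_id)],
    ]
```

    Note that pg_rewind requires the target cluster to have been shut down
    cleanly and needs either wal_log_hints = on or data checksums enabled, so
    any automated resync would have to check those preconditions first.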
    
    I’m thinking of something like a new configuration section in pgpool.conf:
    
    auto_resync = on
    resync_method = 'pg_rewind'
    resync_user = 'replicator'
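    
    To make the intended semantics concrete, here is a small Python sketch of
    how these settings could be read and validated. The three parameter names
    are only my proposal and do not exist in Pgpool-II today; the parser is a
    simplified stand-in for the real pgpool.conf parser (it does not handle a
    '#' inside a quoted value), and the set of allowed methods is an
    assumption.

```python
# Hypothetical: the only resync method this sketch accepts.
ALLOWED_METHODS = {"pg_rewind"}


def parse_conf(text: str) -> dict:
    """Parse simple "key = value" lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'")
    return settings


def resync_settings(settings: dict) -> dict:
    """Validate the proposed auto-resync parameters and apply defaults."""
    enabled = settings.get("auto_resync", "off").lower() in ("on", "true", "yes", "1")
    method = settings.get("resync_method", "pg_rewind")
    user = settings.get("resync_user", "")
    if enabled and method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported resync_method: {method!r}")
    if enabled and not user:
        raise ValueError("resync_user must be set when auto_resync is on")
    return {"enabled": enabled, "method": method, "user": user}
```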
    
    The feature could hook into the existing failback workflow (perhaps in
    failover.c or recovery.c), so that Pgpool performs resync + reattach
    seamlessly when the failed node returns.
    
    Would this be something the Pgpool team would consider as an enhancement?
    
    Thanks again for your time and guidance.
    
    Best regards,
    *Vasuki M*
    CDAC, Chennai
    vasukim1992002@gmail.com
    
    On Fri, 17 Oct 2025 at 13:30, Bo Peng <pengbo@sraoss.co.jp> wrote:
    
    > Hi,
    >
    > Thank you for your question.
    >
    > > While working with PostgreSQL failover scenarios, I noticed that the
    > > process of re-attaching a standby node after a failover can be somewhat
    > > manual and prone to delays, especially in production environments.
    >
    > After a failover, the standby nodes can be automatically attached to the
    > new primary by setting "follow_primary_command".
    >
    >
    > https://www.pgpool.net/docs/latest/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS
    >
    > You can also automatically reattach a failed standby node by setting
    > "auto_failback = on".
    >
    >
    > https://www.pgpool.net/docs/latest/ja/html/runtime-config-failover.html#GUC-AUTO-FAILBACK
    >
    > ---
    > Bo Peng <pengbo@sraoss.co.jp>
    > SRA OSS K.K.
    > TEL: 03-5979-2701 FAX: 03-5979-2702
    > Mobile: 080-7752-0749
    > URL: https://www.sraoss.co.jp/
    >
    >
    > ________________________________________
    > From: VASUKI M <vasukim1992002@gmail.com>
    > Sent: Friday, 10 October 2025 21:17
    > To: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
    > Cc: bharatdb@cdac.in <bharatdb@cdac.in>;
    > pgpool-general@lists.postgresql.org <pgpool-general@lists.postgresql.org>
    > Subject: Automating Failover Resync & Re-Attach in pgpool2
    >
    > Dear PostgreSQL and Pgpool Communities,
    >
    > While working with PostgreSQL failover scenarios, I noticed that the
    > process of re-attaching a standby node after a failover can be somewhat
    > manual and prone to delays, especially in production environments.
    >
    > I explored automating this process using a combination of pg_rewind and
    > WAL replay, which allows a standby node to resynchronize and re-attach to
    > the primary automatically after a failover. This could reduce downtime and
    > simplify management of failover nodes in high-availability setups.
    >
    > - Automatically resynchronize after failover
    > - Reduce downtime and ensure quicker recovery
    > - Minimize manual operations and errors
    > - Maintain consistent cluster state with less administrative overhead
    >
    > I believe that integrating such an automated resync and re-attach feature
    > into Pgpool-II could be very valuable for PostgreSQL users, potentially as
    > an enhancement in a future release.
    >
    > I wanted to share this idea with the community to get feedback,
    > suggestions, or any pointers on existing work that may align with this. I
    > am happy to contribute more details