Thread

  1. Re: Report bytes and transactions actually sent downtream

    Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> — 2025-12-18T12:52:52Z

    On Thu, Dec 18, 2025 at 7:56 AM Chao Li <li.evan.chao@gmail.com> wrote:
    >
    >
    >
    > > On Dec 17, 2025, at 13:55, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
    > >
    > > Thanks for pointing this out. I have fixed it my code. However, at
    > > this point I am looking for a design review, especially to verify that
    > > the new implementation addresses Andres's concern raised in [1] while
    > > not introducing any design issues raised earlier e.g. those raised in
    > > threads [2], [3] and [4]
    > >
    > > [1] https://www.postgresql.org/message-id/zzidfgaowvlv4opptrcdlw57vmulnh7gnes4aerl6u35mirelm@tj2vzseptkjk
    > >>> [2] https://www.postgresql.org/message-id/CAA4eK1KzYaq9dcaa20Pv44ewomUPj_PbbeLfEnvzuXYMZtNw0A%40mail.gmail.com
    > >>> [3] https://www.postgresql.org/message-id/aNZ1T5vYC1BtKs4M@ip-10-97-1-34.eu-west-3.compute.internal
    > >>> [4] https://www.postgresql.org/message-id/CAExHW5tfVHABuv1moL_shp7oPrWmg8ha7T8CqwZxiMrKror7iw%40mail.gmail.com
    > >
    > > --
    > > Best Wishes,
    > > Ashutosh Bapat
    >
    >
    > Hi Ashutosh,
    >
    > Yeah, I owe you a review. I committed to review this patch but I forgot, sorry about that.
    >
    > From design perspective, I agree increasing counters should belong to the core, plugin should return properly values following the contract. And I got some more comments:
    >
    > 1. I just feel a bool return value might not be clear enough. For example:
    >
    > ```
    > -       ctx->callbacks.change_cb(ctx, txn, relation, change);
    > +       if (!ctx->callbacks.change_cb(ctx, txn, relation, change))
    > +               cache->filteredBytes += ReorderBufferChangeSize(change);
    > ```
    >
    > You increase filteredBytes when change_cb returns false. But if we look at pgoutput_change(), there are many reasons to return false. Counting all the cases to filteredBytes seems wrong.
    
    I am not able to understand this. Every "return false" from
    pgoutput_change() indicates that the change was filtered out and hence
    the size of corresponding change is being added to filteredBytes by
    the caller. Which "return false" does not indicate a filtered out
    change?
    
    >
    > 2.
    > ```
    > -       ctx->callbacks.truncate_cb(ctx, txn, nrelations, relations, change);
    > +       if (!ctx->callbacks.truncate_cb(ctx, txn, nrelations, relations, change))
    > +               cache->filteredBytes += ReorderBufferChangeSize(change);
    > ```
    >
    > Row filter doesn’t impact TRUNCATE, why increase filteredBytes after truncate_cb()?
    
    A TRUNCATE of a relation which is not part of the publication will be
    filtered out.
    
    >
    > 3.
    > ```
    > -       ctx->callbacks.prepare_cb(ctx, txn, prepare_lsn);
    > +       if (ctx->callbacks.prepare_cb(ctx, txn, prepare_lsn))
    > +               cache->sentTxns++;
    > ```
    >
    > For 2-phase commit, it increase sentTxns after prepare_cb, and
    > ```
    > +       if (ctx->callbacks.stream_abort_cb(ctx, txn, abort_lsn))
    > +               cache->sentTxns++;
    > ```
    >
    > If the transaction is aborted, sentTxns is increased again, which is confusing. Though for aborting there is some data (a notification) is streamed, but I don’t think that should be counted as a transaction.
    >
    > After commit, sentTxns is also increased, so that, a 2-phase commit is counted as two transactions, which feels also confusing. IMO, a 2-phase commit should still be counted as one transaction.
    
    stream_commit/abort_cb is called after stream_prepare_cb not after prepare_cb.
    
    >
    > 4. You add sentBytes and filteredBytes. I am thinking if it makes sense to also add sentRows and filteredRows. Because tables could be big or small, bytes + rows could show a more clear picture to users.
    
    We don't have corresponding total_rows and streamed_rows counts. I
    think that's because we haven't come across a use case for them. Do
    you have a use case in mind?
    
    --
    Best Wishes,
    Ashutosh Bapat