Thread

  1. Re: Report bytes and transactions actually sent downtream

    Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> — 2025-12-18T12:52:40Z

    Hi Bertrand,
    
    On Wed, Dec 17, 2025 at 2:12 PM Bertrand Drouvot
    <bertranddrouvot.pg@gmail.com> wrote:
    >
    > What worries me is all those API changes:
    >
    > -typedef void (*LogicalDecodeChangeCB) (struct LogicalDecodingContext *ctx,
    > +typedef bool (*LogicalDecodeChangeCB) (struct LogicalDecodingContext *ctx,
    >
    > Those changes will break existing third party logical decoding plugin, even ones
    > that don't want the new statistics features.
    >
    > What about not changing those and just add a single new optional callback, say?
    >
    > typedef void (*LogicalDecodeReportStatsCB)(
    >     LogicalDecodingContext *ctx,
    >     ReorderBufferTXN *txn,
    >     bool *transaction_sent,
    >     size_t *bytes_filtered
    > );
    >
    > This way:
    >
    > - Existing plugins can still work without modification
    > - New or existing plugins can choose to provide statistics
    >
    
    I think that it will bring back the same problems that the previous
    design had or am I missing something? Let me elaborate:
    1. If every plugin implements the calculation of filtered_bytes
    differently, the same set of WAL passed through different output
    plugins would report different filtered bytes, even if they filtered
    the same changes. I think Andres wants minimal changes in the output
    plugins to avoid these divergences.
    2. This also has the problem that you had raised. What if an output
    plugin had calls to this callback in one version but removed them in
    the next.
    3. An output plugin may simply not realise that it can use this
    function to maintain statistics. Or The plugin may not call the
    function in all the places that it needs to. Or It may not realise it
    needs to call this function in a new callback added in the new
    PostgreSQL version. There are many ways an output plugin may get it
    wrong. I think this is also the reason Andres wants minimal changes
    output plugin to maintaining statistics.
    4. filteredBytes and sentTxns are not updated at the same place, so
    the plugins have to send one of those values as 0 always when calling
    the function. We need two functions one for each sentTxns and
    filteredBytes. That means more chances of error and divergence.
    
    The new implementation does not have these problems
    1. As the API is changed in the new implementation, every output
    plugin is forced to change their implementation. Amit and I discussed
    this aspect starting [1]. The plugins will detect the change when
    compiling their code against PG 19, so they won't miss it. The change
    expected from every plugin is minimal and well documented. They have
    to simply return true or false and rest will be taken care of by the
    core. So there is less chance of error or divergence.
    2. The plugin can not go back and forth on maintaining the statistics
    - an issue you raised. The API will force it to always return the
    required status.
    3. I think getting the correct statistics is more important than
    making it optional, especially when the changes expected from the
    plugin are simple. Thinking more about it, users wouldn't want to
    change their output plugin just because other output plugin supports
    statistics.
    
    Ideally, it would have been better if this was raised when Myself and
    Amit discussed this proposal [1], a month ago; before I spent time and
    effort implementing the design. But better now than before a commit.
    
    [1] https://www.postgresql.org/message-id/CAA4eK1K4Pq=acoXx3dEF7us_NFrDVU+M7f_j7KXm+Q2ywY+LSQ@mail.gmail.com
    
    
    
    
    --
    Best Wishes,
    Ashutosh Bapat