Thread

  1. Re: POC: make mxidoff 64 bits

    Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> — 2025-12-08T15:43:00Z

    On Sat, Dec 6, 2025 at 5:06 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
    > On 05/12/2025 15:42, Ashutosh Bapat wrote:
    
    >
    > > 007_multixact_conversion.pl fires thousands of queries through
    > > BackgroundPsql which prints debug output for each of the queries. When
    > > running this file with oldinstall set,
    > > 2.2M  regress_log_007_multixact_conversion (size of file)
    > > 77874 regress_log_007_multixact_conversion (wc -l output)
    > >
    > >
    > > Since this output is also copied in testlog.txt, the effect is two-fold.
    > >
    > >
    > > Most, if not all, of this output is useless. It also makes it hard to
    > > find the output we are looking for. PFA patch which reduces this
    > > output. The patch adds a flag verbose to query_safe() and query() to
    > > toggle this output. With the patch the sizes are
    > > 27K  regress_log_007_multixact_conversion
    > > 588 regress_log_007_multixact_conversion
    > >
    > >
    > > And it makes the test faster by about a second or two on my laptop.
    > > Something on those lines or other is required to reduce the output
    > > from query_safe().
    >
    > Nice! That log bloat was the reason I bundled together the "COMMIT;
    > BEGIN; SELECT ...;" steps into one statement in the loop. Your solution
    > addresses it more directly.
    
    Now we can call query_safe() separately on each of those. That will be
    more readable and marginally less code.
    
    >
    > I turned 'verbose' into a keyword parameter, for future extensibility of
    > those functions, so you now call it like "$node->query_safe("SELECT 1",
    > verbose => 0);". I also set "log_statements=none" in those connections,
    > to reduce the noise in the server log too.
    
    keyword parameter is better. also +1 for log_statements.
    
    >
    > > Some more comments
    > > +++ b/src/bin/pg_upgrade/multixact_old.c
    > >
    > >
    > > We may need to introduce new _new and then _old will become _older.
    > > Should we rename the files to have pre19 and post19 or some similar
    > > suffixes which make it clear what is meant by old and new?
    >
    > +1. I renamed multixact_old.c to multixact_pre_v19.c. And
    > multixact_new.c to multixact_rewrite.c. I also moved the
    > "convert_multixact" function that drives the conversion to
    > multixact_rewrite.c. The idea is that if in the future we change the
    > format again, we will have:
    >
    > multixact_pre_v19.c  # for reading -v19 files
    > multixact_pre_v24.c  # for reading v19-v23 files
    > multixact_rewrite.c  # for writing new files
    >
    > Hard to predict what a possible future format might look like and how
    > we'd want to organize the code then, though. This can be changed then if
    > needed, but it makes sense now.
    
    +1.
    
    >
    > > +static inline int64
    > > +MultiXactIdToOffsetPage(MultiXactId multi)
    > >
    > >
    > > The prologue mentions that the definitions are copy-pasted from
    > > multixact.c from version 18, but they share the names with functions
    > > in the current version. I think that's going to be a good source of
    > > confusion especially in a file which is a few hundred lines long. Can
    > > we rename them to have "Old" prefix or something similar?
    >
    > Fair. On the other hand, having the same names makes it easier to see
    > what the real differences with the server functions are. Not sure what's
    > best here..
    >
    > As long as we use the same names, it's important that
    > multixact_pre_v19.c doesn't #include the new definitions. I added some
    > comments on that, and also this safeguard:
    >
    > #define MultiXactOffset should_not_be_used
    >
    > That actually caught one (harmless) instance in the file where we had
    > not renamed MultiXactOffset to OldMultiXactOffset.
    >
    
    That looks useful, and has proved to be useful already.
    
    > I'm not entirely happy with the "Old" prefix here, because as you
    > pointed out, we might end up needing "older" or "oldold" in the future.
    > I couldn't come up with anything better though. "PreV19MultiXactOffset"
    > is quite a mouthful.
    
    How about MultiXactOffset32?
    
    Thanks for addressing rest of the comments.
    
    >
    > > +
    > > + note ">>> case #${tag}\n"
    > > +   . " oldnode mxoff from ${start_mxoff} to ${finish_mxoff}\n"
    > > +   . " newnode mxoff ${new_next_mxoff}\n";
    > >
    > >
    > > Should we check that some condition holds between finish_mxoff and
    > > new_next_mxoff?
    >
    > Got something in mind that we could check?
    >
    
    I have always seen that finish_mxoff is very high compared to newnode
    mxoff - given that we write only one member per mxid, is newnode mxoff
    going to be always something like 4K or so? Then we can check that
    value. But I will experiment more to see if I can come up with
    something, if possible.
    
    -- 
    Best Wishes,
    Ashutosh Bapat