Re: pg_upgrade test for binary compatibility of core data types

Justin Pryzby <pryzby@telsasoft.com>

From: Justin Pryzby <pryzby@telsasoft.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Andrew Dunstan <andrew@dunslane.net>, Bruce Momjian <bruce@momjian.us>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>, Tomas Vondra <tomas.vondra@2ndquadrant.com>, Andres Freund <andres@anarazel.de>, buschmann@nidsa.net, pgsql-hackers@lists.postgresql.org, Noah Misch <noah@leadboat.com>
Date: 2021-04-30T18:33:48Z
Lists: pgsql-bugs, pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Move into separate file all the SQL queries used in pg_upgrade tests

  2. Add table to regression tests for binary-compatibility checks in pg_upgrade

  3. Fix tests of pg_upgrade across different major versions

  4. Multirange datatypes

  5. Work around cross-version-upgrade issues created by commit 9e38c2bb5.

  6. Declare assorted array functions using anycompatible not anyelement.

  7. Remove factorial operators, leaving only the factorial() function.

  8. Create by default sql/ and expected/ for output directory in pg_regress

  9. Add missing include to pg_upgrade/version.c

  10. Improve the check for pg_catalog.line data type in pg_upgrade

  11. Improve the check for pg_catalog.unknown data type in pg_upgrade

  12. Check for tables with sql_identifier during pg_upgrade

  13. pg_upgrade: clarify the database names in error files

  14. In the pg_upgrade test suite, don't write to src/test/regress.

  15. Allow group access on PGDATA

  16. Refactor dir/file permissions

  17. Remove unused functions in regress.c.

  18. Make WAL segment size configurable at initdb time.

  19. Fix bit-rot in pg_upgrade's test.sh, and improve documentation.

Attachments

On Sat, Mar 06, 2021 at 03:01:43PM -0500, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> > On 2021-01-12 22:44, Andrew Dunstan wrote:
> >> Cross version pg_upgrade is tested regularly in the buildfarm, but not
> >> using test.sh. Instead it uses the saved data repository from a previous
> >> run of the buildfarm client for the source branch, and tries to upgrade
> >> that to the target branch.
> 
> > Does it maintain a set of fixups similar to what is in test.sh?  Are 
> > those two sets the same?
> 
> Responding to Peter: the first answer is yes, the second is I didn't
> check, but certainly Justin's patch makes them closer.

Right - I had meant to send this.

https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm

                $opsql = 'drop operator if exists public.=> (bigint, NONE)';
..
                my $missing_funcs = q{drop function if exists public.boxarea(box);
                              drop function if exists public.funny_dup17();
..
                my $prstmt = join(';',
                        'drop operator if exists #@# (bigint,NONE)',
                        'drop operator if exists #%# (bigint,NONE)',
                        'drop operator if exists !=- (bigint,NONE)',
..
                        $prstmt = join(';',
                                'drop operator @#@ (NONE, bigint)',
..
                                'drop aggregate if exists public.array_cat_accum(anyarray)',

> I spent some time poking through this set of patches.  I agree that
> there's problem(s) here that we need to solve, but it feels like this
> isn't a great way to solve them.  What I see in the patchset is:

For starters, is there a "release beta checklist" ?
Testing test.sh should be on it.
So should fuzz testing.

> v4-0001 mostly teaches test.sh about specific changes that have to be
> made to historic versions of the regression database to allow them
> to be reloaded into current servers.  As already discussed, this is
> really duplicative of knowledge that's been embedded into the buildfarm
> client over time.  It'd be better if we could refactor that so that
> the buildfarm shares a common database of these actions with test.sh.
> And said database ought to be in our git tree, so committers could
> fix problems without having to get Andrew involved every time.
> I think this could be represented as a psql script, at least in
> versions that have psql \if (but that came in in v10, so maybe
> we're there already).

I started this.  I don't know if it's compatible with the buildfarm client, but
I think any issues maybe can be avoided by using "IF EXISTS".

> v4-0002 is a bunch of random changes that mostly seem to revert hacky
> adjustments previously made to improve test coverage.  I don't really
> agree with any of these, nor see why they're necessary.  If they
> are necessary then we need to restore the coverage somewhere else.
> Admittedly, the previous changes were a bit hacky, but deleting them
> (without even bothering to adjust the relevant comments) isn't the
> answer.

It was necessary to avoid --wal-segsize and -g to allow testing upgrades from
versions which don't support those options.  I think test.sh should be portable
back to all supported versions.

When those options were added, it broke test.sh upgrading from old versions.
I changed this to a shell conditional for the "new" features:
| "$1" -N -A trust ${oldsrc:+--wal-segsize 1 -g}
Ideally it would check the version.

> v4-0003 is really the heart of the matter: it adds a table with some
> previously-not-covered datatypes plus a query that purports to make sure
> that we are covering all types of interest.

Actually the 'manytypes' table intends to include *all* core datatypes itself,
not just those that aren't included somewhere else.  I think "included
somewhere else" depends on the order of the regression these, and type_sanity
runs early, so the table might need to include many types that are created
later, to avoid "false positives" in the associated test.

> But I'm not sure I believe
> that query.  It's got hard-wired assumptions about which typtype values
> need to be covered.  Why is it okay to exclude range and multirange?
> Are we sure that all composites are okay to exclude?  Likewise, the
> restriction to pg_catalog and information_schema schemas seems likely to
> bite us someday.  There are some very random exclusions based on name
> patterns, which seem unsafe (let's list the specific type OIDs), and
> again the nearby comments don't match the code.  But the biggest issue
> is that this can only cover core datatypes, not any contrib stuff.

I changed to use regtype/OIDs, included range/multirange and stopped including
only pg_catalog/information_schema.  But didn't yet handle composites.

> I don't know what we could do about contrib types.  Maybe we should
> figure that covering core types is already a step forward, and be
> happy with getting that done.

Right .. this is meant to at least handle the lowest hanging fruit.

-- 
Justin