Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12

Tomas Vondra <tomas.vondra@2ndquadrant.com>

From: Tomas Vondra <tomas.vondra@2ndquadrant.com>
To: buschmann@nidsa.net, pgsql-bugs@lists.postgresql.org
Date: 2019-10-09T23:07:59Z
Lists: pgsql-bugs, pgsql-hackers

Commits

  1. Move into separate file all the SQL queries used in pg_upgrade tests

  2. Add table to regression tests for binary-compatibility checks in pg_upgrade

  3. Fix tests of pg_upgrade across different major versions

  4. Multirange datatypes

  5. Work around cross-version-upgrade issues created by commit 9e38c2bb5.

  6. Declare assorted array functions using anycompatible not anyelement.

  7. Remove factorial operators, leaving only the factorial() function.

  8. Create by default sql/ and expected/ for output directory in pg_regress

  9. Add missing include to pg_upgrade/version.c

  10. Improve the check for pg_catalog.line data type in pg_upgrade

  11. Improve the check for pg_catalog.unknown data type in pg_upgrade

  12. Check for tables with sql_identifier during pg_upgrade

  13. pg_upgrade: clarify the database names in error files

  14. In the pg_upgrade test suite, don't write to src/test/regress.

  15. Allow group access on PGDATA

  16. Refactor dir/file permissions

  17. Remove unused functions in regress.c.

  18. Make WAL segment size configurable at initdb time.

  19. Fix bit-rot in pg_upgrade's test.sh, and improve documentation.

Well, I think I found the root cause. It's because of 7c15cef86d, which
changed the definition of sql_identifier so that it's a domain over name
instead of varchar. So on PostgreSQL 12 we now have this:

  SELECT typname, typlen FROM pg_type WHERE typname = 'sql_identifier';

  -[ RECORD 1 ]--+---------------
  typname        | sql_identifier
  typlen         | 64

instead of this (as on PostgreSQL 11):

  -[ RECORD 1 ]--+---------------
  typname        | sql_identifier
  typlen         | -1

Unfortunately, that seems very much like a break of the on-disk format, and
after pg_upgrade any table containing sql_identifier columns is pretty
much guaranteed to be badly mangled. For example, the first row from the
table used in the original report looks like this on PostgreSQL 11:

  test=# select ctid, * from q_tbl_archiv limit 1;
  -[ RECORD 1 ]----+--------------------------
  ctid             | (0,1)
  table_name       | _pg_foreign_data_wrappers
  column_name      | foreign_data_wrapper_name
  ordinal_position | 5
  col_qualifier    | foreign_data_wrapper_name
  id_column        | 
  id_default       | 

while on PostgreSQL 12 after pg_upgrade it looks like this:

  test=# select ctid, table_name, column_name, ordinal_position from q_tbl_archiv limit 1;
  -[ RECORD 1 ]----+---------------------------------------------------------
  ctid             | (0,1)
  table_name       | 5_pg_foreign_data_wrappers5foreign_data_wrapper_name\x05
  column_name      | _data_wrapper_name
  ordinal_position | 0
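
The mangled values are consistent with the old variable-length bytes being
read as fixed-width name fields. Below is a minimal Python sketch of that
reinterpretation, not PostgreSQL code. It assumes a little-endian machine,
the default NAMEDATALEN of 64, short (1-byte header) varlenas for the three
text columns, and an int4 ordinal_position. The helper names short_varlena
and read_name are made up for illustration, and the tuple header and null
bitmap are ignored.

```python
import struct

NAMEDATALEN = 64  # assumed default build: "name" occupies 64 bytes on disk


def short_varlena(text: str) -> bytes:
    """Encode text as a short varlena (1-byte header, little-endian rule):
    header = (total_length << 1) | 1, where total_length includes the header."""
    data = text.encode("ascii")
    total = len(data) + 1
    assert total <= 127, "sketch only covers the 1-byte header case"
    return bytes([(total << 1) | 1]) + data


def read_name(buf: bytes, offset: int) -> str:
    """Read a fixed-width, NUL-terminated name field starting at offset."""
    field = buf[offset:offset + NAMEDATALEN]
    return field.split(b"\x00", 1)[0].decode("ascii")


# Data area of the PostgreSQL 11 row (first four columns only).
row = short_varlena("_pg_foreign_data_wrappers")   # table_name
row += short_varlena("foreign_data_wrapper_name")  # column_name
row += b"\x00" * (-len(row) % 4)                    # pad so int4 is 4-byte aligned
row += struct.pack("<i", 5)                         # ordinal_position
row += short_varlena("foreign_data_wrapper_name")  # col_qualifier

# PostgreSQL 12 treats the same bytes as two consecutive 64-byte names.
table_name = read_name(row, 0)
column_name = read_name(row, NAMEDATALEN)

print(repr(table_name))
print(repr(column_name))
```

Under these assumptions the two printed values match the table_name and
column_name shown above. The leading "5" is the 1-byte length header
(26 << 1 | 1 = 0x35), and the trailing \x05 is the first byte of
ordinal_position, after which a zero byte ends the name.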

Not sure what to do about this :-(


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services