Re: BUG #16045: vacuum_db crash and illegal memory alloc after pg_upgrade from PG11 to PG12
Tomas Vondra <tomas.vondra@2ndquadrant.com>
Commits
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Move into separate file all the SQL queries used in pg_upgrade tests
- 1924d508c335 10.20 landed
- 0e603b75c434 11.15 landed
- b6e525648d72 12.10 landed
- fae5f08e1719 13.6 landed
- b6dac98b0561 14.2 landed
- 0df9641d3905 15.0 landed
-
Add table to regression tests for binary-compatibility checks in pg_upgrade
- a9993416f80f 12.10 landed
- 755f04c72ef1 13.6 landed
- cf3d79aa31f2 14.2 landed
- 835bcba8b8d7 15.0 landed
-
Fix tests of pg_upgrade across different major versions
- afa09e4a9af6 12.9 landed
- 2a8dee6a67cc 13.5 landed
- f4e1c8892b9e 14.1 landed
- fa66b6dee084 15.0 landed
-
Multirange datatypes
- 6df7a9698bb0 14.0 cited
-
Work around cross-version-upgrade issues created by commit 9e38c2bb5.
- 97f73a978fc1 14.0 cited
-
Declare assorted array functions using anycompatible not anyelement.
- 9e38c2bb5093 14.0 cited
-
Remove factorial operators, leaving only the factorial() function.
- 76f412ab3105 14.0 cited
-
Create by default sql/ and expected/ for output directory in pg_regress
- e78900afd217 14.0 cited
-
Add missing include to pg_upgrade/version.c
- bc3a94dc0005 9.4.25 landed
- 984aa0ede1d2 9.5.20 landed
- e09ab32a2205 9.6.16 landed
-
Improve the check for pg_catalog.line data type in pg_upgrade
- 235a52ca0f26 9.4.25 landed
- f57b01dd75ee 9.5.20 landed
- 0a643de08715 9.6.16 landed
- 2218fdca496b 10.11 landed
- a970b6cdebd1 11.6 landed
- ebb4caa9120d 12.1 landed
- 8d48e6a7240c 13.0 landed
-
Improve the check for pg_catalog.unknown data type in pg_upgrade
- e86ece22114d 10.11 landed
- d071a2539ff4 11.6 landed
- a8e49ae0c381 12.1 landed
- a524f50d0fc6 13.0 landed
-
Check for tables with sql_identifier during pg_upgrade
- eaf900e842ab 12.1 landed
- 0ccfc2822366 13.0 landed
-
pg_upgrade: clarify the database names in error files
- 1634d361577a 13.0 cited
-
In the pg_upgrade test suite, don't write to src/test/regress.
- 40b132c1afbb 12.0 cited
-
Allow group access on PGDATA
- c37b3d08ca68 11.0 cited
-
Refactor dir/file permissions
- da9b580d8990 11.0 cited
-
Remove unused functions in regress.c.
- db3af9feb19f 11.0 cited
-
Make WAL segment size configurable at initdb time.
- fc49e24fa69a 11.0 cited
-
Fix bit-rot in pg_upgrade's test.sh, and improve documentation.
- 5bab1985dfc2 10.0 cited
On Tue, Oct 08, 2019 at 05:08:53PM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference: 16045
>Logged by: Hans Buschmann
>Email address: buschmann@nidsa.net
>PostgreSQL version: 12.0
>Operating system: Windows 10 64bit
>Description:
>
>I just did a pg_upgrade from pg 11.5 to pg 12.0 on my development machine
>under Windows 64bit (both distributions from EDB).
>
>cpsdb=# select version ();
> version
>------------------------------------------------------------
> PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit
>(1 row)
>
>The pg_upgrade with --link went flawlessly, I started (only!) the new server
>12.0 and could connect and access individual databases.
>
>As recommended by the resulting analyze_new_cluster.bat I tried a full
>vacuumdb with:
>
>"N:/pgsql/bin/vacuumdb" -U postgres --all --analyze-only
>
>which crashed with
>vacuumdb: vacuuming database "cpsdb"
>vacuumdb: error: vacuuming of table "admin.q_tbl_archiv" in database "cpsdb"
>failed: ERROR: compressed data is corrupted
>
>I connected to the database through pgsql and looked at the table
>"admin.q_tbl_archiv"
>
>cpsdb=# \d+ q_tbl_archiv;
> Table
>"admin.q_tbl_archiv"
> Column | Type | Collation |
>Nullable | Default | Storage | Stats target | Description
>------------------+------------------------------------+-----------+----------+---------+----------+--------------+-------------
> table_name | information_schema.sql_identifier | |
> | | plain | |
> column_name | information_schema.sql_identifier | |
> | | plain | |
> ordinal_position | information_schema.cardinal_number | |
> | | plain | |
> col_qualifier | text | |
> | | extended | |
> id_column | information_schema.sql_identifier | |
> | | plain | |
> id_default | information_schema.character_data | |
> | | extended | |
>Access method: heap
>
>When trying to select * from q_tbl_archiv I got:
>
>cpsdb=# select * from q_tbl_archiv;
>ERROR: invalid memory alloc request size 18446744073709551613
>
>This table was created a long time back under 9.5 or 9.6 with the (here
>truncated) following command:
>
>
>create table q_tbl_archiv as
>with
>qseason as (
>select table_name,column_name, ordinal_position
>,replace(column_name,'_season','') as col_qualifier
>-- ,'id_'||replace(column_name,'_season','') as id_column
>from information_schema.columns
>where
>column_name like '%_season'
>and ordinal_position < 10
>and table_name in (
> 'table1'
>,'table2'
>-- here truncated:
>-- ... (here where all table of mine having columns like xxx_season)
>-- to reproduce, change to your own tablenames in a test database
>)
>order by table_name
>)
>select qs.*,c.column_name as id_column, c.column_default as id_default
>from
> qseason qs
> left join information_schema.columns c on c.table_name=qs.table_name and
>c.column_name like 'id_%'
>;
>
>Until now this table was always restored without error by migrating to a new
>major version through pg_dump/initdb/pr_restore.
>
>To verify the integrity of the table I restored the dump taken under pg_dump
>from pg 11.5 just before the pg_upgrade to another machine.
>
>The restore and analyze went OK and select * from q_tbl_archiv showed all
>tuples, eg (edited):
>
>cpsdb_dev=# select * from q_tbl_archiv;
> table_name | column_name | ordinal_position | col_qualifier
>| id_column | id_default
>--------------------------+--------------+------------------+---------------+-----------+----------------------------------------------------------
> table1 | chm_season | 2 | chm
>| |
> table2 | cs_season | 2 | cs
>| id_cs | nextval('table2_id_cs_seq'::regclass)
>...
>
>In conclusion, this seems to me like an error/omission of pg_upgrade.
>
There's clearly something bad happening. It's a bit strange, though. Had
this been a data corruption issue, I'd expect the pg_dump to fail too,
but it succeeds.
>It seems to handle these specially derived tables from information_schema
>not correctly, resulting in failures of the upgraded database.
>
Well, I don't see how that should make any difference. It's a CTAS and
that should create a regular table, that's not an issue. I wonder if
there were some changes to the data types involved, but that would be
essentially a break in on-disk format and we're careful about not doing
that ...
>For me, this error is not so crucial, because this table is only used for
>administrative purposes and can easily be restored from backup.
>
>But I want to share my findings for the sake of other users of pg_upgrade.
>
OK, thanks. Could you maybe set
log_error_verbosity = verbose
before invoking the vacuum (you can set that in that session)? That
should give us more details about where exactly the error is triggered.
Even better, if you could attach a debugger to the session, set
breakpoints on locations triggering 'invalid memory alloc request size'
and then show the backtrace (obviously, that's more complicated).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services