Re: backup manifests and contemporaneous buildfarm failures

Robert Haas <robertmhaas@gmail.com>

From: Robert Haas <robertmhaas@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Fabien COELHO <coelho@cri.ensmp.fr>, Alvaro Herrera <alvherre@2ndquadrant.com>, David Steele <david@pgmasters.net>, Andres Freund <andres@anarazel.de>, Noah Misch <noah@leadboat.com>, Stephen Frost <sfrost@snowman.net>, Amit Kapila <amit.kapila16@gmail.com>, Suraj Kharage <suraj.kharage@enterprisedb.com>, tushar <tushar.ahuja@enterprisedb.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Rushabh Lathia <rushabh.lathia@gmail.com>, Tels <nospam-pg-abuse@bloodgate.com>, Andrew Dunstan <andrew.dunstan@2ndquadrant.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, vignesh C <vignesh21@gmail.com>, Thomas Munro <thomas.munro@gmail.com>
Date: 2020-04-04T13:36:08Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Try to avoid compiler warnings in optimized builds.

  2. Fix option related issues in pg_verifybackup.

  3. Add index term for backup manifest in documentation.

  4. Code review for backup manifest.

  5. Document the backup manifest file format.

  6. Fix typo in pg_validatebackup documentation.

  7. Exclude backup_manifest file that existed in database, from BASE_BACKUP.

  8. Msys2 tweaks for pg_validatebackup corruption test

  9. Fix resource management bug with replication=database.

  10. Be more careful about time_t vs. pg_time_t in basebackup.c.

  11. pg_validatebackup: Fix 'make clean' to remove tmp_check.

  12. pg_validatebackup: Also use perl2host in TAP tests.

  13. Generate backup manifests for base backups, and validate them.

  14. Add checksum helper functions.

  15. pg_waldump: Add a --quiet option.

  16. Catversion bump for b9b408c48724

  17. pg_basebackup: Refactor code for reading COPY and tar data.

  18. Use a ResourceOwner to track buffer pins in all cases.

  19. Use ARMv8 CRC instructions where available.

  20. Logical replication support for initial data copy

  21. Use Intel SSE 4.2 CRC instructions where available.

  22. Switch to CRC-32C in WAL and other places.

  23. Remove support for 64-bit CRC.

  24. Change CRCs in WAL records from 64bit to 32bit for performance reasons.

On Fri, Apr 3, 2020 at 10:43 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think I've done about as much as I can do for tonight, though. Most
> things are green now, and the ones that aren't are failing because of
> stuff that is at least plausibly fixed. By morning it should be
> clearer how much broken stuff is left, although that will be somewhat
> complicated by at least sidewinder and seawasp needing manual
> intervention to get back on track.

Taking stock of the situation this morning, most of the buildfarm is
now green. There are three failures, on eelpout (6 hours ago),
fairywren (17 hours ago), and hyrax (3 days, 7 hours ago).

eelpout is unhappy because:

+WARNING:  could not remove shared memory segment
"/PostgreSQL.248989127": No such file or directory
+WARNING:  could not remove shared memory segment
"/PostgreSQL.1450751626": No such file or directory
  multibatch
 ------------
  f
@@ -861,22 +863,15 @@

 select length(max(s.t))
 from wide left join (select id, coalesce(t, '') || '' as t from wide)
s using (id);
- length
---------
- 320000
-(1 row)
-
+ERROR:  could not open shared memory segment "/PostgreSQL.605707657":
No such file or directory
+CONTEXT:  parallel worker

I'm not sure what caused that exactly, but it sorta looks like
operator intervention. Thomas, any ideas?

fairywren's last run was on 21dc488, and commit
460314db08e8688e1a54a0a26657941e058e45c5 was an attempt to fix what
broken there. I guess we'll find out whether that worked the next time
it runs.

hyrax's last run was before any of this happened, so it seems to have
an unrelated problem. The last two runs, three and six days ago, both
failed like this:

-ERROR:  stack depth limit exceeded
+ERROR:  stack depth limit exceeded at character 8

Not sure what that's about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company