Re: In-placre persistance change of a relation

Heikki Linnakangas <hlinnaka@iki.fi>

From: Heikki Linnakangas <hlinnaka@iki.fi>
To: Kyotaro Horiguchi <horikyota.ntt@gmail.com>, michael@paquier.xyz
Cc: nathandbossart@gmail.com, postgres@jeltef.nl, smithpb2250@gmail.com, vignesh21@gmail.com, jakub.wartak@enterprisedb.com, stark.cfm@gmail.com, barwick@gmail.com, jchampion@timescale.com, pryzby@telsasoft.com, tgl@sss.pgh.pa.us, rjuju123@gmail.com, jakub.wartak@tomtom.com, pgsql-hackers@lists.postgresql.org, bharath.rupireddyforpostgres@gmail.com
Date: 2024-09-01T19:15:00Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. pg_dump: Refactor getIndexes()

  2. Optimize DropRelFileNodesAllBuffers() for recovery.

On 31/08/2024 19:09, Kyotaro Horiguchi wrote:
> - UNDO log(0002): This handles file deletion during transaction aborts,
>    which was previously managed, in part, by the commit XLOG record at
>    the end of a transaction.
> 
> - Prevent orphan files after a crash (0005): This is another use-case
>    of the UNDO log system.

Nice, I'm very excited if we can fix that long-standing issue! I'll try 
to review this properly later, but at a quick 5 minute glance, one thing 
caught my eye:

This requires fsync()ing the per-xid undo log file every time a relation 
is created. I fear that can be a pretty big performance hit for 
workloads that repeatedly create and drop small tables. Especially if 
they're otherwise running with synchronous_commit=off. Instead of 
flushing the undo log file after every write, I'd suggest WAL-logging 
the undo log like regular relations and SLRUs. So before writing the 
entry to the undo log, WAL-log it. And with a little more effort, you 
could postpone creating the files altogether until a checkpoint happens, 
similar to how twophase state files are checkpointed nowadays.

I wonder if the twophase state files and undo log files should be merged 
into one file. They're similar in many ways: there's one file per 
transaction, named using the XID. I haven't thought this fully through, 
just a thought..

> +static void
> +undolog_set_filename(char *buf, TransactionId xid)
> +{
> +	snprintf(buf, MAXPGPATH, "%s/%08x", SIMPLE_UNDOLOG_DIR, xid);
> +}

I'd suggest using FullTransactionId. Doesn't matter much, but seems like 
a good future-proofing.

-- 
Heikki Linnakangas
Neon (https://neon.tech)