Re: Extensible storage manager API - SMGR hook Redux

Andreas Karlsson <andreas@proxel.se>

From: Andreas Karlsson <andreas@proxel.se>
To: Tristan Partin <tristan@neon.tech>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Cc: Matthias van de Meent <boekewurm+postgres@gmail.com>, Heikki Linnakangas <hlinnaka@iki.fi>, Andres Freund <andres@anarazel.de>, zsolt.parragi@cancellar.hu, nitinjadhavpostgres@gmail.com, gongxun0928@gmail.com
Date: 2025-02-03T11:27:23Z
Lists: pgsql-hackers

Hi!

We at Percona are very interested in this patch for our transparent data 
encryption extension. So we would love to collaborate with you, and 
anyone else interested, on making the SMGR extensible.

I have attached rebased and slightly cleaned-up versions of Tristan's 
patches, plus a couple of patches we have been working on in-house 
(mainly my colleague Zsolt). I also have some questions which I would 
like to discuss.

0001-0004

The same patches as Tristan posted, but rebased and cleaned up a bit to 
better follow the code style. I also removed a couple of dead variables 
which seemed like leftovers.

0005

Since we support having both encrypted and unencrypted relations, we use 
the RelFileLocator to look up whether a relation is encrypted. To 
preserve that information when smgrcreate() creates a new relfile for a 
relation, we pass along the old RelFileLocator.
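
To illustrate the idea (this is not the code from 0005), here is a 
minimal standalone sketch. RelFileLocatorSketch, the in-memory table and 
all function names are hypothetical stand-ins; the real patch changes 
smgrcreate() itself and our extension keeps its own state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's RelFileLocator. */
typedef struct
{
    uint32_t    spcOid;
    uint32_t    dbOid;
    uint32_t    relNumber;
} RelFileLocatorSketch;

/* Hypothetical in-memory set of locators known to be encrypted. */
#define MAX_ENCRYPTED 16
static RelFileLocatorSketch encrypted[MAX_ENCRYPTED];
static size_t n_encrypted = 0;

static bool
locator_equal(const RelFileLocatorSketch *a, const RelFileLocatorSketch *b)
{
    return a->spcOid == b->spcOid &&
        a->dbOid == b->dbOid &&
        a->relNumber == b->relNumber;
}

static bool
is_encrypted(const RelFileLocatorSketch *loc)
{
    for (size_t i = 0; i < n_encrypted; i++)
    {
        if (locator_equal(&encrypted[i], loc))
            return true;
    }
    return false;
}

/* Returns false if the table is full. */
static bool
mark_encrypted(const RelFileLocatorSketch *loc)
{
    if (is_encrypted(loc))
        return true;
    if (n_encrypted >= MAX_ENCRYPTED)
        return false;
    encrypted[n_encrypted++] = *loc;
    return true;
}

/*
 * Sketch of a create call that also receives the old locator (NULL for a
 * brand new relation). The new relfile inherits the encryption status of
 * the old one; a real implementation would also create the file here.
 */
static bool
sketch_create(const RelFileLocatorSketch *newloc,
              const RelFileLocatorSketch *oldloc)
{
    if (oldloc != NULL && is_encrypted(oldloc))
        return mark_encrypted(newloc);
    return true;
}
```

Without the old locator the create call would have no way of knowing 
that the new relfile replaces an encrypted one.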

For our use case it is possible that we could solve this in other ways. 
For example, if we decide to go with configuring the SMGR per schema, 
this will probably not be necessary at all.

0006

The patch introduces the concept of "chaining" SMGRs, where we have a 
tail (e.g. md or a theoretical Ceph SMGR) and a modifier (e.g. TDE or 
the fsync_checker). Something like this would be useful for our case, 
since it would be nice to be able to use the same encryption code for md 
and for some potential replacement for md which uses, for example, some 
kind of object storage.
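
A rough standalone sketch of what chaining means, under heavy 
simplification: SmgrLink is a hypothetical struct and not the interface 
from 0006, the tail is an in-memory stand-in for md, and the modifier 
uses XOR as a placeholder for real encryption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192
#define MEM_NBLOCKS 4

/* Hypothetical, simplified link in an SMGR chain. */
typedef struct SmgrLink SmgrLink;
struct SmgrLink
{
    const char *name;
    SmgrLink   *next;       /* NULL for the tail */
    int         (*write) (SmgrLink *self, uint32_t blkno, const uint8_t *buf);
    int         (*read) (SmgrLink *self, uint32_t blkno, uint8_t *buf);
};

/* Tail: keeps blocks in memory, standing in for md or an object store. */
static uint8_t mem_blocks[MEM_NBLOCKS][BLCKSZ];

static int
mem_write(SmgrLink *self, uint32_t blkno, const uint8_t *buf)
{
    (void) self;
    if (blkno >= MEM_NBLOCKS)
        return -1;
    memcpy(mem_blocks[blkno], buf, BLCKSZ);
    return 0;
}

static int
mem_read(SmgrLink *self, uint32_t blkno, uint8_t *buf)
{
    (void) self;
    if (blkno >= MEM_NBLOCKS)
        return -1;
    memcpy(buf, mem_blocks[blkno], BLCKSZ);
    return 0;
}

/* Placeholder transform; XOR is its own inverse. Not real encryption. */
static void
xor_buf(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < BLCKSZ; i++)
        dst[i] = src[i] ^ 0x5A;
}

/* Modifier: transforms the block, then delegates to the next link. */
static int
xor_write(SmgrLink *self, uint32_t blkno, const uint8_t *buf)
{
    uint8_t     tmp[BLCKSZ];

    xor_buf(tmp, buf);
    return self->next->write(self->next, blkno, tmp);
}

static int
xor_read(SmgrLink *self, uint32_t blkno, uint8_t *buf)
{
    int         rc = self->next->read(self->next, blkno, buf);

    if (rc == 0)
        xor_buf(buf, buf);
    return rc;
}

/* The chain: modifier first, tail last. */
static SmgrLink mem_tail = {"mem", NULL, mem_write, mem_read};
static SmgrLink xor_modifier = {"xor", &mem_tail, xor_write, xor_read};
```

The modifier never touches storage directly, so the same modifier could 
sit in front of any tail that implements the same callbacks.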

As a bonus this allowed us to make the functions implementing md static.

It is currently controlled via a GUC, smgr_chain, but this will of 
course depend on how we decide to implement configuring which SMGR to use.

Questions

- What is up with the barrier when loading SMGRs? That does not seem 
necessary or am I missing something? I believe Andres also spotted this.

- How should we configure which SMGR to use for each relation? People 
have talked about doing it per tablespace or using hooks and we have a 
patch which uses a GUC for this. I have personally not researched these 
options enough to have an opinion yet.

- Is our idea about chaining SMGRs useful? In its current form or some 
variant inspired by it?

- We need to benchmark this to make sure we do not introduce too much 
overhead, especially for people who just want to use md. I saw, for 
example, that Andres had some complaints about the extra indirection, 
which we may have to address.

Andreas