Re: Transparent column encryption

Dave Cramer <davecramer@postgres.rocks> — 2025-08-23T10:20:26Z
On Thu, 18 Apr 2024 at 12:46, Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Apr 10, 2024 at 6:13 AM Peter Eisentraut <peter@eisentraut.org>
> wrote:
> > Obviously, it's early days, so there will be plenty of time to have
> > discussions on various other aspects of this patch.  I'm keeping a keen
> > eye on the discussion of protocol extensions, for example.
>
> I think the way that you handled that is clever, and along the lines
> of what I had in mind when I invented the _pq_ stuff.
>
> More specifically, the way that the ColumnEncryptionKey and
> ColumnMasterKey messages are handled is exactly the way that I was
> imagining things would work. The client uses _pq_.column_encryption to
> signal that it can understand those messages, and the server responds
> by including them. I assume that if the client doesn't signal
> understanding, then the server simply omits sending those messages. (I
> have not checked the code.)
>
> I'm less certain about the changes to the ParameterDescription and
> RowDescription messages. I see a couple of potential problems. One is
> that, if you say you can understand column encryption messages, the
> extra fields are included even for unencrypted columns. The client
> must choose at connection startup whether it ever wishes to read any
> encrypted data; if so, it pays a portion of that overhead all the
> time. Another potential problem is with the scalability of this
> design. Suppose that we could not only encrypt columns, but also
> compress, fold, mutilate, and spindle them. Then there might end up
> being a dizzying array of variation in the format of what is supposed
> to be the same message. Perhaps it's not so bad: as long as the
> documentation is clear about in which order the additional fields will
> appear in the relevant messages when more than one relevant feature is
> used, it's probably not too difficult for clients to cope. And it is
> probably also true that the precise size of, say, a RowDescription
> message will rarely be performance-critical. But another thought is
> that we might try to redesign this so that we simply add more message
> types rather than mutating message types, i.e., after sending the
> RowDescription message, if any columns are encrypted, we additionally
> send a RowEncryptionDescription message. Then this treatment becomes
> symmetric with the handling of ColumnEncryptionKey and ColumnMasterKey
> messages, and there's no overhead when the feature is unused.
>
> With regard to the Bind message, I suggest that we regard the protocol
> change as reserving a currently-unused bit in the message to indicate
> whether the value is pre-encrypted, without reference to the protocol
> extension. It could be legal for a client that can't understand
> encryption messages from the server to supply an encrypted value to be
> inserted into a column. And I don't think we would ever want the bit
> that's being reserved here to be used by some other extension for some
> other purpose, even when this extension isn't used. So I don't see a
> need for this to be tied into the protocol extension.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
I just picked this thread up, so apologies if this has already been
discussed.

Instead of sending the information about encrypted columns back in the
response to Describe, I have been contemplating returning that information
in the ParseComplete message. I've thought about this for other things as
well. The JDBC driver has to do a round trip to describe a named statement
after we parse it. It seems to me that returning the description
immediately with ParseComplete would avoid this round trip.
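
For anyone following along, here is a rough sketch of the v3 frontend
messages involved, written in Python purely for illustration. The helper
names (frame, parse_msg, describe_statement_msg, sync_msg) are made up for
this mail and are not taken from pgjdbc or libpq. The byte layouts follow
the message formats in the protocol documentation. Note that ParseComplete
as it exists today is just a type byte and a length, so carrying a
description in it would mean giving it a body or adding a new message.

```python
import struct


def frame(type_byte: bytes, payload: bytes) -> bytes:
    """Frame a v3 protocol message: type byte, int32 length, payload.

    The length counts itself (4 bytes) plus the payload, not the type byte.
    """
    return type_byte + struct.pack("!I", len(payload) + 4) + payload


def parse_msg(stmt_name: str, query: str, param_oids=()) -> bytes:
    """Parse ('P'): statement name, query text, then parameter type OIDs."""
    payload = stmt_name.encode() + b"\x00" + query.encode() + b"\x00"
    payload += struct.pack("!H", len(param_oids))
    for oid in param_oids:
        payload += struct.pack("!I", oid)
    return frame(b"P", payload)


def describe_statement_msg(stmt_name: str) -> bytes:
    """Describe ('D') with target 'S', i.e. a prepared statement."""
    return frame(b"D", b"S" + stmt_name.encode() + b"\x00")


def sync_msg() -> bytes:
    """Sync ('S') has no payload."""
    return frame(b"S", b"")


# ParseComplete as the server sends it today: '1' plus a length of 4.
# There is no payload, so there is currently nowhere to put a description.
PARSE_COMPLETE = b"1" + struct.pack("!I", 4)

if __name__ == "__main__":
    # Hypothetical statement name and query, for illustration only.
    for msg in (parse_msg("s1", "SELECT $1::int4"),
                describe_statement_msg("s1"),
                sync_msg()):
        print(msg)
```

With the current protocol the description only arrives as
ParameterDescription and RowDescription (or NoData) in answer to the
Describe message, which is the step I would like to fold into the reply
to Parse.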



Dave