Thread

  1. Re: Refactor code around GUC default_toast_compression

    Ayush Tiwari <ayushtiwari.slg01@gmail.com> — 2026-05-01T09:14:07Z

    Hi,
    
    
    On Fri, 1 May 2026 at 13:21, Michael Paquier <michael@paquier.xyz> wrote:
    
    > Hi all,
    >
    > While hacking on the TOAST code, I have been annoyed more than once
    > with the following piece in toast_compression.h:
    > /*
    >  * Built-in compression method ID.  The toast compression header will store
    >  * this in the first 2 bits of the raw length.  These built-in compression
    >  * method IDs are directly mapped to the built-in compression methods.
    >  *
    >  * Don't use these values for anything other than understanding the meaning
    >  * of the raw bits from a varlena; in particular, if the goal is to
    > identify
    >  * a compression method, use the constants TOAST_PGLZ_COMPRESSION, etc.
    >  * below. We might someday support more than 4 compression methods, but
    >  * we can never have more than 4 values in this enum, because there are
    >  * only 2 bits available in the places where this is stored.
    >  */
    > typedef enum ToastCompressionId
    > {
    >         TOAST_PGLZ_COMPRESSION_ID = 0,
    >         TOAST_LZ4_COMPRESSION_ID = 1,
    >         TOAST_INVALID_COMPRESSION_ID = 2,
    > } ToastCompressionId;
    >
    > This is due the fact that we have only two bits that can be used in
    > va_tcinfo or va_extinfo.  While looking at the addition of a new
    > compression method, this was causing a mess, so I have hacked the
    > attached patch, that makes the addition of more compression methods
    > easier.  The idea is centralized in toast_compression.c, with the
    > addition of a registry that knows about all the TOAST compression
    > methods and its meta-data:
    > - name
    > - GUC enum values.
    > - attcompression char value.
    > - varatt on-disk value.
    >
    >
    I looked at the patch, and the refactoring direction looks reasonable
    to me.  I noticed a few small things worth cleaning up (although
    it is for v20, just wanted to drop it here for future)
    
    1. The patch includes an unrelated hunk in
       doc/src/sgml/ref/alter_index.sgml, adding text about `ALTER INDEX ...
       ATTACH PARTITION`.  That looks like an accidental carry-over from another
       patch and shouldn't be there ig.
    
    2. The comment in src/include/access/toast_compression.h describing
       default_toast_compression looks stale after this change?  It still says
       that the GUC value is one of the char values stored in
       pg_attribute.attcompression, but the patch changes it to use the new
       ToastCompressionGucValue enum values instead.(Maybe I'm
       missing something)
    
    3. One minor point: CompressionIdToMethod() seems to be added as a public
       helper, but I could not find any callers in this patch.  Also,
       pg_column_compression() still keeps its own cmid-to-name switch.  If the
       intent is to centralize these mappings in the registry, perhaps that code
       could use the new helper path as well, otherwise the unused helper may
    not
       be necessary yet (though it might be in future).
    
    Thanks for the patch!
    
    Regards,
    Ayush