Thread

  1. Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)

    Hannu Krosing <hannuk@google.com> — 2025-07-08T18:54:33Z

    I still think we should go with direct toast tid pointers in varlena
    and not some kind of oid.
    
    It will remove the need for any oid management and also will be
    many-many orders of magnitude faster for large tables (just 2x faster
    for in-memory small tables)
    
    I plan to go over Michael's patch set here and see how much change is
    needed to add the "direct toast" approach.
    
    My goals are:
    1. fast lookup by skipping the index lookup
    2. making the toast pointer in main heap as small as possible -
    hopefully just the 6 bytes of tid pointer - so that scans that do not
    need toasted values get more tuples from each page
    3. adding all the (optional) extra data into the toast chunk record, as
    there we are free to add whatever is needed
    
    Currently I plan to introduce something like this for the toast chunk record:
    
    Column      | Type    | Storage | Notes
    ------------+---------+---------+------------------------------------------
    chunk_id    | oid     | plain   | 0 when not using the toast index;
                |         |         | 0xfffe = non-deletable, for example when
                |         |         | used as a dictionary for multiple
                |         |         | toasted values
    chunk_seq   | integer | plain   | if not 0 when referenced from a toast
                |         |         | pointer, then the toasted data starts at
                |         |         | toast_pages[0] (or below it in that
                |         |         | tree), which *must* have chunk_id = 0
    chunk_data  | bytea   | plain   |
    
    -- added fields
    
    toast_pages | tid[]   | plain   | can be chained or make up a tree
    offsets     | int[]   | plain   | starting offsets of the toast_pages
                |         |         | (octets or type-specific units); the
                |         |         | upper bit indicates that a new
                |         |         | compressed span starts at that offset,
                |         |         | the 2nd highest bit indicates that the
                |         |         | page is another tree page
    comp_method | int     | plain   | compression method used; maybe this
                |         |         | should be an enum?
    dict_pages  | tid[]   | plain   | pages to use as compression dictionary,
                |         |         | up to N pages, one level
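
To make the offsets encoding above concrete, here is a minimal sketch and not
code from any patch. Every name prefixed with Sketch/sketch_ is hypothetical.
It assumes 32-bit unsigned entries (the proposal says int[]), start offsets
stored in ascending order, and a tid laid out like PostgreSQL's 6-byte
ItemPointerData.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 6-byte tid, shaped like PostgreSQL's ItemPointerData. */
typedef struct SketchTid
{
    uint16_t blk_hi;    /* high 16 bits of the block number */
    uint16_t blk_lo;    /* low 16 bits of the block number */
    uint16_t offnum;    /* line pointer number within the block */
} SketchTid;

static_assert(sizeof(SketchTid) == 6, "tid pointer should be 6 bytes");

/* Flag bits carried in each offsets[] entry (assumed 32-bit entries). */
#define SKETCH_OFF_NEW_SPAN   UINT32_C(0x80000000)  /* upper bit */
#define SKETCH_OFF_TREE_PAGE  UINT32_C(0x40000000)  /* 2nd highest bit */
#define SKETCH_OFF_MASK       UINT32_C(0x3FFFFFFF)  /* remaining 30 bits */

/* Combine a start offset with its two flag bits. */
static inline uint32_t
sketch_pack_offset(uint32_t offset, bool new_span, bool tree_page)
{
    assert(offset <= SKETCH_OFF_MASK);
    return offset
        | (new_span ? SKETCH_OFF_NEW_SPAN : 0)
        | (tree_page ? SKETCH_OFF_TREE_PAGE : 0);
}

/* Start offset with the flag bits stripped. */
static inline uint32_t
sketch_offset_value(uint32_t entry)
{
    return entry & SKETCH_OFF_MASK;
}

/* True if a new compressed span starts at this entry's offset. */
static inline bool
sketch_starts_span(uint32_t entry)
{
    return (entry & SKETCH_OFF_NEW_SPAN) != 0;
}

/* True if the referenced page is another tree page, not data. */
static inline bool
sketch_is_tree_page(uint32_t entry)
{
    return (entry & SKETCH_OFF_TREE_PAGE) != 0;
}

/*
 * Binary search for the last entry whose start offset is <= pos.
 * Returns its index into offsets[] (and so into toast_pages[]), or -1
 * if pos lies before the first entry.  Assumes ascending start offsets.
 */
static int
sketch_find_page(const uint32_t *offsets, int n, uint32_t pos)
{
    int lo = 0, hi = n - 1, found = -1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (sketch_offset_value(offsets[mid]) <= pos)
        {
            found = mid;
            lo = mid + 1;
        }
        else
            hi = mid - 1;
    }
    return found;
}
```

With entries starting at 0, 2000 and 4000, a partial read at position 2500
resolves to index 1 without touching the other pages. If that entry has the
tree-page bit set, the reader would descend into the referenced page and
repeat the search there.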
    
    This seems to be flexible enough to allow for both compression and
    efficient partial updates.
    
    ---
    Hannu
    
    
    On Tue, Jul 8, 2025 at 8:31 PM Nikita Malakhov <hukutoc@gmail.com> wrote:
    >
    > Hi!
    >
    > Greg, thanks for the interest in our work!
    >
    > Michael, one more thing I forgot to mention yesterday -
    > #define TOAST_EXTERNAL_INFO_SIZE (VARTAG_ONDISK_OID + 1)
    > static const toast_external_info toast_external_infos[TOAST_EXTERNAL_INFO_SIZE]
    > VARTAG_ONDISK_OID historically has a value of 18
    > and here we got an array of 19 members with only 2 valid ones.
    >
    > What do you think about having an individual
    > TOAST value id counter per relation instead of using
    > a common one? I think this is a very promising approach,
    > but a decision must be made where it should be stored.
    >
    > --
    > Regards,
    > Nikita Malakhov
    > Postgres Professional
    > The Russian Postgres Company
    > https://postgrespro.ru/
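
On the sparse-array point in the quoted message: one possible alternative,
shown only as a sketch, is to map the vartag to its info through a switch
instead of an array sized by the largest tag value. All sketch_ names are
hypothetical, the struct is reduced to a single illustrative field, and the
value 4 for the second tag is a placeholder assumption. Only the value 18
comes from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Tag values: 18 is quoted in the thread; 4 is a placeholder assumption. */
enum
{
    SKETCH_VARTAG_ONDISK_INT8 = 4,
    SKETCH_VARTAG_ONDISK_OID = 18
};

/* An array indexed directly by tag needs (largest tag + 1) slots. */
static_assert(SKETCH_VARTAG_ONDISK_OID + 1 == 19,
              "direct indexing needs 19 slots for 2 valid entries");

/* Hypothetical stand-in for the per-tag info struct. */
typedef struct sketch_external_info
{
    const char *name;   /* illustrative field only */
} sketch_external_info;

static const sketch_external_info sketch_info_int8 = { "ondisk_int8" };
static const sketch_external_info sketch_info_oid = { "ondisk_oid" };

/* Return the info for a known tag, or NULL for any other value. */
static const sketch_external_info *
sketch_get_external_info(int vartag)
{
    switch (vartag)
    {
        case SKETCH_VARTAG_ONDISK_INT8:
            return &sketch_info_int8;
        case SKETCH_VARTAG_ONDISK_OID:
            return &sketch_info_oid;
        default:
            return NULL;
    }
}
```

The trade-off is a branch per lookup in exchange for no unused slots and an
explicit NULL for unknown tags. Whether that matters in practice would need
measuring.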