Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

Dharin Shah <dharinshah95@gmail.com> — 2025-12-24T00:47:16Z
Hello,

Following up on my earlier patch submission, I've reworked the zstd TOAST
compression implementation based on our discussion here. The new patch now
avoids the 20-byte extended header.

Current Approach
- New `VARTAG_ONDISK_ZSTD` (value 19) for ZSTD external storage
- Maintains existing 16-byte varatt_external structure
- ZSTD external-only (no inline compression)

Note: Using a dedicated VARTAG_ONDISK_ZSTD keeps the on-disk TOAST pointer
payload at 16 bytes, but it is not a general extensible metadata carrier.
If PostgreSQL later adopts a more general extensible TOAST framework, this
change should not block it; VARTAG_ONDISK_ZSTD would remain as a supported
legacy encoding, while new toasted values could be written using the newer
framework and old values rewritten via normal table rewrites.
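
To make the tag-as-discriminator idea concrete, here is a minimal sketch (not the patch code). The struct mirrors the four 4-byte fields of `varatt_external`, and the tag values 1, 2, 3 and 18 are the existing `vartag_external` values as I understand them; 19 is the value proposed above. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing tags (1, 2, 3, 18) plus the proposed ZSTD tag (19). */
typedef enum sketch_vartag
{
    SKETCH_VARTAG_INDIRECT = 1,
    SKETCH_VARTAG_EXPANDED_RO = 2,
    SKETCH_VARTAG_EXPANDED_RW = 3,
    SKETCH_VARTAG_ONDISK = 18,
    SKETCH_VARTAG_ONDISK_ZSTD = 19
} sketch_vartag;

/* Same field layout as the 16-byte varatt_external payload. */
typedef struct sketch_varatt_external
{
    int32_t  va_rawsize;    /* original data size */
    uint32_t va_extinfo;    /* external size and compression bits */
    uint32_t va_valueid;    /* value OID within the TOAST table */
    uint32_t va_toastrelid; /* OID of the TOAST table */
} sketch_varatt_external;

/* The payload size does not change for the new tag. */
static_assert(sizeof(sketch_varatt_external) == 16,
              "TOAST pointer payload must stay at 16 bytes");

/* Both on-disk tags carry the same payload. */
static inline bool
sketch_tag_is_ondisk(sketch_vartag tag)
{
    return tag == SKETCH_VARTAG_ONDISK || tag == SKETCH_VARTAG_ONDISK_ZSTD;
}

/* Only the tag tells the reader that the external data is ZSTD. */
static inline bool
sketch_tag_is_zstd(sketch_vartag tag)
{
    return tag == SKETCH_VARTAG_ONDISK_ZSTD;
}
```

The point of the sketch is that a reader dispatches on the tag alone, so no bits inside the 16-byte payload are consumed to identify ZSTD.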

Storage (170 MB uncompressed):
    ZSTD: 22 MB (7.60x) - 38.7% space savings vs LZ4
    PGLZ: 36 MB (4.76x)
    LZ4:  36 MB (4.66x)
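
For reference, the ratio and savings figures are plain size quotients. The helper names below are hypothetical and only illustrate the arithmetic. Fed with the rounded MB values from the table they give roughly 7.7x and 38.9%, close to but not exactly the figures shown, which presumably come from unrounded sizes.

```c
#include <assert.h>

/* Uncompressed size divided by compressed size (e.g. 4.0 means 4x smaller). */
static inline double
sketch_compression_ratio(double raw_mb, double compressed_mb)
{
    assert(compressed_mb > 0.0);
    return raw_mb / compressed_mb;
}

/* Percentage of space saved by candidate_mb relative to baseline_mb. */
static inline double
sketch_space_savings_pct(double baseline_mb, double candidate_mb)
{
    assert(baseline_mb > 0.0);
    return 100.0 * (1.0 - candidate_mb / baseline_mb);
}
```

For example, `sketch_space_savings_pct(36.0, 22.0)` is the ZSTD-versus-LZ4 comparison using the rounded sizes.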

Key findings:
- Large values (>50KB): ZSTD 33% better compression than PGLZ (~30% better
than LZ4)
- Low-entropy data: ZSTD compresses what LZ77 methods cannot
- Small values: ZSTD pays external overhead vs inline PGLZ/LZ4

ZSTD uses less space overall, but because it is external-only, small values
incur a TOAST fetch that inline PGLZ/LZ4 values avoid, which may hurt read
performance.

Backwards Compatibility Tests
- Mixed compression: Rows with PGLZ, LZ4, and ZSTD coexist and decompress
correctly
- Lazy recompression: ALTER COLUMN ... SET COMPRESSION zstd affects new
data; existing data is lazily recompressed upon UPDATE or VACUUM FULL.
- Inline vs external: Small values remain inline; large values use
appropriate external compression.
- Data integrity: All data decompresses correctly across all methods.

Trade-offs and Design Considerations

- External-only avoids consuming cmid=3 and extended header complexity

- Slice access: no ZSTD-specific optimization (follow-up area)

- Hybrid inline/external for small values: not in this patch (feedback
welcome)
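
For context on the cmid point: as I understand the existing layout, `va_extinfo` keeps the external size in its low 30 bits and the compression method id in the top 2 bits, so only four ids exist in total. A minimal sketch of that packing follows; the `sketch_` names are hypothetical, not the real macros.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EXTSIZE_BITS 30
#define SKETCH_EXTSIZE_MASK ((1U << SKETCH_EXTSIZE_BITS) - 1)

/* Pack a 30-bit external size and a 2-bit compression method id. */
static inline uint32_t
sketch_pack_extinfo(uint32_t extsize, uint32_t cmid)
{
    assert(extsize <= SKETCH_EXTSIZE_MASK); /* size must fit in 30 bits */
    assert(cmid <= 3);                      /* only 2 bits for the method */
    return (cmid << SKETCH_EXTSIZE_BITS) | extsize;
}

/* Low 30 bits: size of the externally stored data. */
static inline uint32_t
sketch_extsize(uint32_t extinfo)
{
    return extinfo & SKETCH_EXTSIZE_MASK;
}

/* Top 2 bits: compression method id. */
static inline uint32_t
sketch_cmid(uint32_t extinfo)
{
    return extinfo >> SKETCH_EXTSIZE_BITS;
}
```

With the vartag approach, the ZSTD case never touches these two bits, which is why cmid=3 stays free.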

Reviewer Questions
- Is vartag-based external-only acceptable?
- Should compression level (currently 3) be configurable?
- Is the external storage overhead for small values acceptable, or is hybrid
inline/external behavior needed?

Thanks,
Dharin

On Thu, Dec 18, 2025 at 11:44 PM Michael Paquier <michael@paquier.xyz>
wrote:

> On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> > I want to make sure I understand your main point: you're OK with a new
> > `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> > from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> > the main heap tuple.
>
> That would be my choice, yes.  Not sure about the opinion of others on
> this matter.
>
> > I also realize the "compatibility" of the extended header doesn't buy us
> > much — we'll need to support the existing 16-byte varatt_external forever
> > for backward compatibility. Adding a 20-byte structure just means two
> > formats to maintain indefinitely.
>
> Yes.  Patches have to maintain on-disk compatibility.
>
> > A couple clarifying questions if we go with new vartag (e.g.,
> > `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> > discriminator
> > 1. How should we handle future methods beyond zstd? One tag per method, or
> > store a method id elsewhere (e.g., in TOAST chunk header)?
>
> My suspicion would be that we could use a new set of vartags in
> the future for each compression method.  When it comes to zstd there
> is something that comes into play: we could set some bits related to
> dictionaries at tuple level.  Not sure if this is the best design or
> if using an attribute-level option is more adapted (for example a
> JSONB blob could be applied as an attribute with common keys in a
> dictionary saving a lot of on-disk space even before compression),
> but keeping some bits free in the 16-byte header leaves this option
> open with a new vartag_external.  Saying that, zstd is good enough
> that I strongly suspect that we would not regret it for quite a few
> years.  One issue that has pushed towards the addition of lz4 as an
> option for toast compression is that pglz was worse in terms of CPU
> cost.  zlib is also more expensive than lz4 or zstd, especially at
> very high compression level for usually little compression gains.
>
> > 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> > the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> > extsize for method encoding)?
>
> I mean extending the TOAST value to 8 bytes, as per the following
> issues:
> https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
> https://commitfest.postgresql.org/patch/5830/
>
> > *Key findings (I guess well known at this point):*
> > - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> > - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> > - The T4 result showing zstd as "worse" is not about compression quality -
> > it's about missing inline storage support. ZSTD actually compresses better,
> > but pays unnecessary TOAST overhead.
> >
> > I'll share the detailed benchmark script with the next patch revision. But
> > also a potential path forward could be that we could just fully replace
> > pglz (can bring it up later in different thread)
>
> I don't think that we will ever be able to remove pglz.  It would be
> nice, as final result of course, but I also expect that not being able
> to decompress pglz data is going to lead to a lot of user pain.  That
> would be also very expensive to check at upgrade for large instances.
>
> > *On Testing and Patch Structure*
> > Agreed on both points:
> > - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> > pattern (removing the test_toast_ext module)
>
> Okay.
>
> > - I'll split the GUC refactoring into a separate preparatory patch
>
> This refactoring, if done nicely, is worth an independent piece.  It's
> something that I have actually done for the sake of the other thread,
> though the result was not really much liked by others.  Perhaps I'm
> just lacking imagination with this abstraction, and I'd surely welcome
> different ideas.
> --
> Michael
>