Thread

  1. Compression (was Re: [HACKERS] varchar/char size)

    Andrew Martin <martin@biochemistry.ucl.ac.uk> — 1998-01-12T14:36:20Z

    > My CA/Ingres Admin manual points out that there is a tradeoff between
    > compressing tuples to save disk storage and the extra processing work
    > required to uncompress for use. They suggest that the only case where you
    > would consider compressing on disk is when your system is very I/O bound,
    > and you have CPU to burn.
    > 
    > The default for Ingres is to not compress anything, but you can specify
    > compression on a table-by-table basis.
    > 
    > btw, char() is a bit trickier to handle correctly if you do compress it on
    > disk, since trailing blanks must be handled correctly all the way through.
    > For example, you would want 'hi' = 'hi   ' to be true, which is not a
    > requirement for varchar().
    > 
    >                                                         - Tom
    
    Anybody thought about real gzip-style compression? There's a specialised
    RDBMS called Iditis (written specifically for one task) which, like
    PostgreSQL, stores data at the file level and uses a gzip-based library
    to access the files. I gather this is transparent to the software. Has
    anyone thought of anything equivalent for PG/SQL?
    
    To be honest I haven't looked into how Iditis does it (it's a commercial
    program and I don't have the source). I don't actually see how this
    could be done for small writes of data - how does it build the lookup
    tables for the compression? However, it might be worth considering for
    use with the text field type.
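
    A rough sketch of the small-write concern, using Python's standard zlib
    module (the same deflate algorithm gzip uses). Deflate does not keep a
    pre-built lookup table; it derives its tables from the data it has already
    seen in the stream, so a very short value gives it nothing to work with.
    The sample values and the helper name compressed_size are made up for
    illustration and say nothing about how Iditis actually does it.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length of data after zlib (deflate) compression at level 9."""
    return len(zlib.compress(data, 9))


# A tiny value, like a short char() or varchar() field (hypothetical sample).
small = b"hi"

# A larger, repetitive value, like a long text field (hypothetical sample).
large = b"The quick brown fox jumps over the lazy dog. " * 200

small_out = compressed_size(small)
large_out = compressed_size(large)

# The tiny value grows: header and checksum overhead exceed any savings.
print(f"small: {len(small)} bytes in, {small_out} bytes out")

# The large value shrinks, because later text repeats earlier text.
print(f"large: {len(large)} bytes in, {large_out} bytes out")

# Compression is lossless in both cases.
assert zlib.decompress(zlib.compress(small, 9)) == small
assert zlib.decompress(zlib.compress(large, 9)) == large
```

    On this reading, compressing each small tuple separately would cost space
    rather than save it, while long text values are the more plausible
    candidates.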
    
    Andrew
    ----------------------------------------------------------------------------
    Dr. Andrew C.R. Martin                             University College London
    EMAIL: (Work) martin@biochem.ucl.ac.uk    (Home) andrew@stagleys.demon.co.uk
    URL:   http://www.biochem.ucl.ac.uk/~martin
    Tel:   (Work) +44(0)171 419 3890                    (Home) +44(0)1372 275775