Thread

  1. Re: encode/decode support for base64url

    Florents Tselai <florents.tselai@gmail.com> — 2025-09-17T12:56:57Z

    On Wed, Sep 17, 2025 at 12:56 AM Masahiko Sawada <sawada.mshk@gmail.com>
    wrote:
    
    > On Wed, Aug 6, 2025 at 12:43 PM Florents Tselai
    > <florents.tselai@gmail.com> wrote:
    > >
    > >
    > >
    > > On Wed, Aug 6, 2025 at 4:34 PM Florents Tselai <
    > florents.tselai@gmail.com> wrote:
    > >>
    > >> Attaching v6 again because it wasn't picked up the last time.
    > >> Trying from Gmail's web page this time.
    > >>
    > >>
    > >>
    > >> On Tue, Aug 5, 2025 at 12:40 PM Florents Tselai <
    > florents.tselai@gmail.com> wrote:
    > >>>
    > >>>
    > >>>
    > >>> On 1 Aug 2025, at 1:13 PM, Florents Tselai <florents.tselai@gmail.com>
    > wrote:
    > >>>
    > >>>
    > >>> On Tue, Jul 29, 2025 at 3:25 PM Daniel Gustafsson <daniel@yesql.se>
    > wrote:
    > >>>>
    > >>>> > On 12 Jul 2025, at 21:40, David E. Wheeler <david@justatheory.com>
    > wrote:
    > >>>>
    > >>>> > Thank you! This looks great. The attached revision makes a a couple
    > of minor changes:
    > >>>>
    > >>>> I also had a look at this today and agree that it looks pretty close
    > to being
    > >>>> done, and a feature we IMHO would like to have.
    > >>>
    > >>>
    > >>> Thanks for having a look Daniel!
    > >>>
    > >>>>
    > >>>>
    > >>>>
    > >>>> The attached version also adds a commit message, tweaks the
    > documentation along
    > >>>> with a few small changes to error message handling etc.
    > >>>
    > >>>
    > >>> In the doc snippet
    > >>>
    > >>> > The base64url alphabet use '-' instead of '+' and '_' instead of '/'
    > and also omits the '=' padding character.
    > >>>
    > >>> Should be
    > >>>
    > >>> > The base64url alphabet uses '-' instead of '+' and '_' instead of
    > '/', and also omits the '=' padding character.
    > >>>
    > >>> I'd also add a comma before "and also"
    > >>>
    > >>>>
    > >>>> The base64 code this extends is the RFC 2045 variant while base64url
    > is based
    > >>>> on base64 from RFC 3548 (obsoleted by RFC 4648).  AFAICT this is not
    > a problem
    > >>>> here but has anyone else verified this?
    > >>>
    > >>>
    > >>> I don't see how this can be a problem in practice.
    > >>> The conversions are straightforward,
    > >>> and the codepath used with url=true is a new one and doesn't change
    > past behavior.
    > >>>
    > >>>
    > >>> Here’s a v6; necessary because func.sgml was split .
    > >>> No other changes compared to v5.
    > >
    > >
    > > v6 introduced some whitespace errors in the regression files.
    > >
    > > Here's a v7 that fixes that
    >
    > While the patch looks good to me I have one question:
    >
    > -                        errmsg("invalid symbol \"%.*s\" found while
    > decoding base64 sequence",
    > -                               pg_mblen(s - 1), s - 1)));
    > +                        errmsg("invalid symbol \"%.*s\" found while
    > decoding %s sequence",
    > +                               pg_mblen(s - 1), s - 1,
    > +                               url ? "base64url" : "base64")));
    >
    > The above change makes the error message mention the encoding name
    > properly. On the other hand, in pg_base64_decode_internal() there are
    > two places where we report invalid data and always mention 'based64'
    > in the error message:
    >
    > ereport(ERROR,
    >         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
    >          errmsg("unexpected \"=\" while decoding base64 sequence")));
    >
    > and
    >
    > ereport(ERROR,
    >         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
    >          errmsg("invalid base64 end sequence"),
    >          errhint("Input data is missing padding, is truncated, or is
    > otherwise corrupted.")));
    >
    > Do we need to have a similar change for these messages?
    >
    
    Good catch, Masahiko-san. They shouldn't be hardcoded either.
    
    I've updated that and also the wording in the regression tests, too.