Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Replace the "New Linear" GiST split algorithm for boxes and points with a

  1. Range Types

    Jeff Davis <pgsql@j-davis.com> — 2011-08-23T17:23:28Z

    Attached is the latest version of the Range Types patch. I will get it
    into better shape before the commitfest, but wanted to put up a draft in
    case anyone had comments on the TODO items.
    
    Changes:
    
      * Uses BTree opclass rather than compare function.
      * Collation specified at type definition time.
      * Various fixes.
    
    TODO:
    
      * Should the catalog hold the opclass or the opfamily? This doesn't
    affect much, but I wasn't sure which to actually store in the catalog.
    
      * Use Robert Haas' suggestion for auto-generating constructors with
    the same name as the range type, e.g. "int8range(1,10,'[]')", where the
    third argument defaults to '[)'. This allows better type inference for
    constructors, especially when there are multiple range types over the
    same base type (and collation is a common case of this). I believe this
    was the best idea after significant discussion:
    http://archives.postgresql.org/pgsql-hackers/2011-06/msg02046.php
    http://archives.postgresql.org/pgsql-hackers/2011-07/msg00210.php
    
      * Send/recv functions
    
      * cleanup
    
      * documentation updates
    
    Regards,
    	Jeff Davis
    
  2. Re: Range Types

    Jeff Davis <pgsql@j-davis.com> — 2011-09-01T07:32:25Z

    Updated patch attached.
    
    Changes:
    
    1. Now supports new constructor scheme. If you define a range foorange,
    you get the following constructors:
     * foorange() -- produces empty foorange
     * foorange(S) -- produces singleton range [S]
     * foorange(L,B) -- produces range [L,B)
     * foorange(L,B,'(]') -- produces range (L,B]
    
    Actually, the two-argument form uses a special "default_flags" that can
    be specified at creation time, and that defaults to '[)'.
    
    The way I accomplish this is by generating 4 functions at definition
    time -- a little ugly, and I am open to suggestions. I ran into a
    problem using the default argument as Robert suggested because
    pg_node_tree doesn't have a working input function (intentionally so),
    so I couldn't get the built-in range types to work with initdb. The
    constructors all essentially point to the same C function, aside from
    some indirection that I did to avoid excessive complaining from the
    opr_sanity test. Again, suggestions welcome.
    
    2. Documentation has been updated.
    
    3. Now there's support for multiple range types over a single base type,
    e.g. two text ranges using different collations.
    
    TODO:
    
    There is still some cleanup to do, e.g. pg_dump. I'd like to get some
    feedback to stabilize the user-facing behavior before I put too much
    effort into the code cleanup (which shouldn't take long, but I just
    don't want to work toward a moving target).
    
    Regards,
    	Jeff Davis
    
  3. Re: Range Types

    Jeff Davis <pgsql@j-davis.com> — 2011-09-13T08:41:36Z

    Another updated patch is attached.
    
    Changes:
    
      * support for send/recv
      * significant cleanup and fixes
      * test improvements
    
    TODO:
    
      * pg_dump support. This requires outputting collation names and
    opclass names (because those are part of the range type definition).
    Currently, that's only done for indexes through special functions in
    ruleutils.c. Should I define such functions there as well, or is there a
    simpler approach? Also, I have to filter out the generated constructor
    functions because those are created internally when defining a new range
    type.
      * some error messages should be improved
      * Originally, I wasn't sure whether to define a RangeCoerceExpr
    (similar to ArrayCoerceExpr), because the only use I had was for typmod.
    But that is necessary for casts as well, so I'll go ahead and do that
    and we'll get both casts and typmod for range types.
      * I think I should avoid some syscache lookups for some of the generic
    range functions. Right now they are done for every invocation, but it
    would be pretty simple to avoid lookups when looking up the same range
    type as last time.
    
    Review questions:
    
      * Do we like the new constructor behavior from the users' standpoint?
      * Right now, the patch accomplishes that behavior by generating
    several constructor functions every time a new range type is defined. Is
    that acceptable? Is there a better way?
    
    Regards,
    	Jeff Davis
    
  4. Re: Range Types - typo + NULL string constructor

    Erik Rijkers <er@xs4all.nl> — 2011-09-18T16:08:22Z

    On Tue, September 13, 2011 10:41, Jeff Davis wrote:
    > Another updated patch is attached.
    >
    
    Hi,
    
    Below are 2 changes.  The first change is an elog saying 'lower' instead of 'upper'.
    
    The second change is less straightforward, but I think it should be changed too:
    
    Rangetypes as it stands uses NULL to indicate INF or -INF:
    
    select int4range(2, NULL);
    
     int4range
    ------------
     [ 2, INF )
    (1 row)
    
    
    but refuses to accept it in the string-form:
    
    select '[ 2 , NULL )'::int4range;
    ERROR:  NULL range boundaries are not supported
    LINE 1: select '[ 2 , NULL )'::int4range;
                   ^
    
    Second part below changes that to accept NULL for INF and -INF
    in the string-form construction. (not complete: it still is
    case-sensitive).
    
    
    Thanks,
    
    Erik Rijkers
    
    
    
    
    --- src/backend/utils/adt/rangetypes.c.orig  2011-09-18 12:35:29.000000000 +0200
    +++ src/backend/utils/adt/rangetypes.c      2011-09-18 16:03:34.000000000 +0200
    @@ -387,7 +387,7 @@
            if (empty)
                    elog(ERROR, "range is empty");
            if (upper.infinite)
    -               elog(ERROR, "range lower bound is infinite");
    +               elog(ERROR, "range upper bound is infinite");
    
            PG_RETURN_DATUM(upper.val);
     }
    @@ -1579,9 +1579,9 @@
                    fl = RANGE_EMPTY;
    
            if (!lb_quoted && strncmp(lb, "NULL", ilen) == 0)
    -               elog(ERROR, "NULL range boundaries are not supported");
    +               fl |= RANGE_LB_INF;
            if (!ub_quoted && strncmp(ub, "NULL", ilen) == 0)
    -               elog(ERROR, "NULL range boundaries are not supported");
    +               fl |= RANGE_UB_INF;
            if (!lb_quoted && strncmp(lb, "-INF", ilen) == 0)
                    fl |= RANGE_LB_INF;
            if (!ub_quoted && strncmp(ub, "INF", ilen) == 0)
    
    
    
    
  5. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-19T05:51:34Z

    On Sun, 2011-09-18 at 18:08 +0200, Erik Rijkers wrote:
    > Below are 2 changes.  The first change is an elog saying 'lower' instead of 'upper'.
    
    Done, thank you. New patch attached.
    
    Changes:
     * documentation fixes
     * added document for pg_range catalog
     * cleaned up errors, increased error checking
     * improved pg_dump
    
    TODO:
     * Support casts and typmod.
       - This requires adding a RangeCoerceExpr, or possibly
         overloading ArrayCoerceExpr somehow. This is likely to
         require a lot of boilerplate code and a fairly large diff.
     * Cache lookups better to avoid unnecessary SearchSysCache calls.
     * I need to find a clean way to get the operator class name in pg_dump.
    
    > Rangetypes as it stands uses NULL to indicate INF or -INF:
    > 
    > select int4range(2, NULL);
    > 
    >  int4range
    > ------------
    >  [ 2, INF )
    > (1 row)
    > 
    > 
    > but refuses to accept it in the string-form:
    > 
    > select '[ 2 , NULL )'::int4range;
    > ERROR:  NULL range boundaries are not supported
    > LINE 1: select '[ 2 , NULL )'::int4range;
    
    I think this might require more opinions. There is a trade-off here
    between convenience and confusion: accepting NULL is convenient in the
    constructors, because it avoids the need to have extra constructors just
    for unbounded ranges; but could lead to confusion between NULL and INF
    (which are not the same).
    
    In the string form, it doesn't add any convenience to accept NULL; but
    as you point out, it seems inconsistent without it.
    
    Thoughts?
    
    Regards,
    	Jeff Davis
    
  6. Re: Range Types - typo + NULL string constructor

    Pavel Stehule <pavel.stehule@gmail.com> — 2011-09-19T07:30:11Z

    hello
    
    sorry for late assign to discussion.
    
    I don't think so using NULL instead INF is a good idea.
    
    Regards
    
    Pavel Stehule
    
    2011/9/19 Jeff Davis <pgsql@j-davis.com>:
    > On Sun, 2011-09-18 at 18:08 +0200, Erik Rijkers wrote:
    >> Below are 2 changes.  The first change is an elog saying 'lower' instead of 'upper'.
    >
    > Done, thank you. New patch attached.
    >
    > Changes:
    >  * documentation fixes
    >  * added document for pg_range catalog
    >  * cleaned up errors, increased error checking
    >  * improved pg_dump
    >
    > TODO:
    >  * Support casts and typmod.
    >   - This requires adding a RangeCoerceExpr, or possibly
    >     overloading ArrayCoerceExpr somehow. This is likely to
    >     require a lot of boilerplate code and a fairly large diff.
    >  * Cache lookups better to avoid unnecessary SearchSysCache calls.
    >  * I need to find a clean way to get the operator class name in pg_dump.
    >
    >> Rangetypes as it stands uses NULL to indicate INF or -INF:
    >>
    >> select int4range(2, NULL);
    >>
    >>  int4range
    >> ------------
    >>  [ 2, INF )
    >> (1 row)
    >>
    >>
    >> but refuses to accept it in the string-form:
    >>
    >> select '[ 2 , NULL )'::int4range;
    >> ERROR:  NULL range boundaries are not supported
    >> LINE 1: select '[ 2 , NULL )'::int4range;
    >
    > I think this might require more opinions. There is a trade-off here
    > between convenience and confusion: accepting NULL is convenient in the
    > constructors, because it avoids the need to have extra constructors just
    > for unbounded ranges; but could lead to confusion between NULL and INF
    > (which are not the same).
    >
    > In the string form, it doesn't add any convenience to accept NULL; but
    > as you point out, it seems inconsistent without it.
    >
    > Thoughts?
    >
    > Regards,
    >        Jeff Davis
    >
    >
    > --
    > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
    > To make changes to your subscription:
    > http://www.postgresql.org/mailpref/pgsql-hackers
    >
    >
    
    
  7. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-09-19T13:33:29Z

    On Mon, Sep 19, 2011 at 1:51 AM, Jeff Davis <pgsql@j-davis.com> wrote:
    >> select '[ 2 , NULL )'::int4range;
    >> ERROR:  NULL range boundaries are not supported
    >> LINE 1: select '[ 2 , NULL )'::int4range;
    >
    > I think this might require more opinions. There is a trade-off here
    > between convenience and confusion: accepting NULL is convenient in the
    > constructors, because it avoids the need to have extra constructors just
    > for unbounded ranges; but could lead to confusion between NULL and INF
    > (which are not the same).
    
    I agree with this line of reasoning.  I think we will be making pain
    for ourselves if we need to invent a bunch more constructors just to
    have a way of indicating an unbounded range, but OTOH I don't see any
    compelling reason why the type input function needs to accept N-U-L-L.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  8. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-19T15:23:34Z

    On Sep19, 2011, at 15:33 , Robert Haas wrote:
    > On Mon, Sep 19, 2011 at 1:51 AM, Jeff Davis <pgsql@j-davis.com> wrote:
    >>> select '[ 2 , NULL )'::int4range;
    >>> ERROR:  NULL range boundaries are not supported
    >>> LINE 1: select '[ 2 , NULL )'::int4range;
    >> 
    >> I think this might require more opinions. There is a trade-off here
    >> between convenience and confusion: accepting NULL is convenient in the
    >> constructors, because it avoids the need to have extra constructors just
    >> for unbounded ranges; but could lead to confusion between NULL and INF
    >> (which are not the same).
    > 
    > I agree with this line of reasoning.  I think we will be making pain
    > for ourselves if we need to invent a bunch more constructors just to
    > have a way of indicating an unbounded range, but OTOH I don't see any
    > compelling reason why the type input function needs to accept N-U-L-L.
    
    The one reason I can see in favour of supporting N-U-L-L there is 
    compatibility with arrays. I've recently had the questionable pleasure
    of writing PHP functions to parse and emit our textual representations of
    arrays, records, dates and timestamps. After that experience, I feel that
    the number of similar-yet-slightly-different textual input output format
    for non-primitive types is already excessive, and any further additions
    should be modeled after some existing ones.
    
    (And BTW, why in heavens sake, is date and time input and output
    asymmetric for some DateStyle settings? Asymmetric like in you need to
    send one format, but get back another...)
    
    best regards,
    Florian Pflug
    
    
    
    
  9. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-19T15:46:27Z

    On Mon, 2011-09-19 at 17:23 +0200, Florian Pflug wrote:
    > The one reason I can see in favour of supporting N-U-L-L there is 
    > compatibility with arrays.
    
    But arrays actually do store and produce NULLs; ranges don't.
    
    >  I've recently had the questionable pleasure
    > of writing PHP functions to parse and emit our textual representations of
    > arrays, records, dates and timestamps. After that experience, I feel that
    > the number of similar-yet-slightly-different textual input output format
    > for non-primitive types is already excessive, and any further additions
    > should be modeled after some existing ones.
    
    I'm not clear on how accepting "NULL" would really save effort. With
    ranges, the brackets have an actual meaning (inclusivity), and empty
    ranges have no brackets at all. So I don't think it's going to be easy
    to write one function to parse everything.
    
    What about binary formats?
    
    Regards,
    	Jeff Davis
    
    
    
  10. Re: Range Types - typo + NULL string constructor

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-09-19T16:00:18Z

    Robert Haas <robertmhaas@gmail.com> wrote:
    > Jeff Davis <pgsql@j-davis.com> wrote:
    >>> select '[ 2 , NULL )'::int4range;
    >>> ERROR:  NULL range boundaries are not supported
    >>> LINE 1: select '[ 2 , NULL )'::int4range;
    >>
    >> I think this might require more opinions. There is a trade-off
    >> here between convenience and confusion: accepting NULL is
    >> convenient in the constructors, because it avoids the need to
    >> have extra constructors just for unbounded ranges; but could lead
    >> to confusion between NULL and INF (which are not the same).
    > 
    > I agree with this line of reasoning.  I think we will be making
    > pain for ourselves if we need to invent a bunch more constructors
    > just to have a way of indicating an unbounded range, but OTOH I
    > don't see any compelling reason why the type input function needs
    > to accept N-U-L-L.
     
    FWIW, the existing semantics of NULL include not only "UNKNOWN" but
    also "NOT APPLICABLE".  It seems fairly natural to think of a range
    as being unbounded if the boundary limit is "not applicable".
     
    On a practical level, our shop is already effectively doing this. 
    We have several tables where part of the primary key is "effective
    date" and there is a null capable "expiration date" -- with a NULL
    meaning that no expiration date has been set.  It would be nice to
    be able to have a "generated column" function which used these two
    dates to build a range for exclusion constraints and such.
     
    -Kevin
    
    
  11. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-09-19T16:26:58Z

    On Mon, Sep 19, 2011 at 11:23 AM, Florian Pflug <fgp@phlo.org> wrote:
    > On Sep19, 2011, at 15:33 , Robert Haas wrote:
    >> On Mon, Sep 19, 2011 at 1:51 AM, Jeff Davis <pgsql@j-davis.com> wrote:
    >>>> select '[ 2 , NULL )'::int4range;
    >>>> ERROR:  NULL range boundaries are not supported
    >>>> LINE 1: select '[ 2 , NULL )'::int4range;
    >>>
    >>> I think this might require more opinions. There is a trade-off here
    >>> between convenience and confusion: accepting NULL is convenient in the
    >>> constructors, because it avoids the need to have extra constructors just
    >>> for unbounded ranges; but could lead to confusion between NULL and INF
    >>> (which are not the same).
    >>
    >> I agree with this line of reasoning.  I think we will be making pain
    >> for ourselves if we need to invent a bunch more constructors just to
    >> have a way of indicating an unbounded range, but OTOH I don't see any
    >> compelling reason why the type input function needs to accept N-U-L-L.
    >
    > The one reason I can see in favour of supporting N-U-L-L there is
    > compatibility with arrays. I've recently had the questionable pleasure
    > of writing PHP functions to parse and emit our textual representations of
    > arrays, records, dates and timestamps. After that experience, I feel that
    > the number of similar-yet-slightly-different textual input output format
    > for non-primitive types is already excessive, and any further additions
    > should be modeled after some existing ones.
    
    Well, I'm not violently opposed to accepting NULL to mean an unbounded
    range.  The semantics of "no bound at all" and "an unknown bound"
    (i.e. NULL) are pretty close, as Kevin also points out downthread.
    But I think the way Jeff actually did it is OK, too.  What I really
    care about is that we don't talk ourselves into needing a zillion
    constructor functions.  Making things work with a single constructor
    function seems to me to simplify life quite a bit, and allowing there
    seems essential for that.
    
    (I am also vaguely wondering what happens if if you have a text
    range.... is (nubile, null) ambiguous?)
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  12. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-19T16:32:44Z

    On Sep19, 2011, at 17:46 , Jeff Davis wrote:
    > On Mon, 2011-09-19 at 17:23 +0200, Florian Pflug wrote:
    >> The one reason I can see in favour of supporting N-U-L-L there is 
    >> compatibility with arrays.
    > 
    > But arrays actually do store and produce NULLs; ranges don't.
    
    Hm, yeah, granted. But OTOH, clients will very likely use NULL, null, nil
    or something similar as a bound to represent unbounded ranges. And those
    will probably already be mapped to SQL's NULL. So in practice, people will
    think of unbounded ranges having the (lower or upper) bound NULL I think.
    
    >> I've recently had the questionable pleasure
    >> of writing PHP functions to parse and emit our textual representations of
    >> arrays, records, dates and timestamps. After that experience, I feel that
    >> the number of similar-yet-slightly-different textual input output format
    >> for non-primitive types is already excessive, and any further additions
    >> should be modeled after some existing ones.
    > 
    > I'm not clear on how accepting "NULL" would really save effort. With
    > ranges, the brackets have an actual meaning (inclusivity), and empty
    > ranges have no brackets at all. So I don't think it's going to be easy
    > to write one function to parse everything.
    
    No, but more similar the format are the easier it gets to at least factor
    the hairy parts of such a parser into a common subroutine. Assume that we
    don't support NULL as an alias for INF. What would then be the result of
    
      '[A,NULL)'::textrange?
    
    Presumably, it'd be the same as textrange('A','NULL','[)'). Which think
    is a bit surprising, since '[A,NULL]'::text[] produces ARRAY['A',NULL],
    *NOT* ARRAY['A','NULL'].
    
    BTW, we currently represent infinity for floating point values as
    'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
    
      int4range(0,NULL,'[)')::text
    
    return 
    
      '[0,Infinity)'?
    
    best regards,
    Florian Pflug
    
    
    
  13. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-19T16:48:42Z

    On Mon, 2011-09-19 at 11:00 -0500, Kevin Grittner wrote:
    > On a practical level, our shop is already effectively doing this. 
    > We have several tables where part of the primary key is "effective
    > date" and there is a null capable "expiration date" -- with a NULL
    > meaning that no expiration date has been set.  It would be nice to
    > be able to have a "generated column" function which used these two
    > dates to build a range for exclusion constraints and such.
    
    Agreed, that's a good convenience argument for accepting NULL boundaries
    in the constructors.
    
    Underneath though, we don't use NULL semantics (because they don't make
    sense for ranges -- in fact, avoiding the need to constantly
    special-case NULLs is one of the reasons to use range types). So, we
    want to avoid confusion where possible.
    
    Regards,
    	Jeff Davis
    
    
    
  14. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-21T07:23:39Z

    On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
    > No, but more similar the format are the easier it gets to at least factor
    > the hairy parts of such a parser into a common subroutine. Assume that we
    > don't support NULL as an alias for INF. What would then be the result of
    > 
    >   '[A,NULL)'::textrange?
    
    I think that the range input should *parse* NULL in a similar way, but
    reject it. So, to make it the range between two definite strings, you'd
    do:
    
      '[A,"NULL")'::textrange
    
    which would be equal to textrange('A','NULL','[)'). Without the quotes,
    it would detect the NULL, and give an error. Open to suggestion here,
    though.
    
    > Presumably, it'd be the same as textrange('A','NULL','[)'). Which think
    > is a bit surprising, since '[A,NULL]'::text[] produces ARRAY['A',NULL],
    > *NOT* ARRAY['A','NULL'].
    > 
    > BTW, we currently represent infinity for floating point values as
    > 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
    > 
    >   int4range(0,NULL,'[)')::text
    > 
    > return 
    > 
    >   '[0,Infinity)'?
    
    I'm open to that, if you think it's an improvement I'll do it (but we
    should probably pick one identifiable string and stick with it). What
    I'd like to avoid is adding to the NULL/infinity confusion.
    
    Regards,
    	Jeff Davis
    
    
    
  15. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-21T07:29:47Z

    On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
    > What I really
    > care about is that we don't talk ourselves into needing a zillion
    > constructor functions.  Making things work with a single constructor
    > function seems to me to simplify life quite a bit, and allowing there
    > seems essential for that.
    
    I think we pretty much all agree on that. However, you did see the note
    about the difficulty of using default parameters in built-in functions,
    right?
    
    I ultimately ended up with 4 constructors, each with the same name but
    0, 1, 2, and 3 parameters. Suggestions welcome.
    
    > (I am also vaguely wondering what happens if if you have a text
    > range.... is (nubile, null) ambiguous?)
    
    There are a few ways to handle that. I would lean toward parsing the
    NULL as a special keyword, and then rejecting it (does it matter if it's
    upper case?).
    
    Regards,
    	Jeff Davis
    
    
    
    
  16. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-21T11:24:03Z

    On Sep21, 2011, at 09:23 , Jeff Davis wrote:
    > On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
    >> No, but more similar the format are the easier it gets to at least factor
    >> the hairy parts of such a parser into a common subroutine. Assume that we
    >> don't support NULL as an alias for INF. What would then be the result of
    >> 
    >>  '[A,NULL)'::textrange?
    > 
    > I think that the range input should *parse* NULL in a similar way, but
    > reject it. So, to make it the range between two definite strings, you'd
    > do:
    > 
    >  '[A,"NULL")'::textrange
    > 
    > which would be equal to textrange('A','NULL','[)'). Without the quotes,
    > it would detect the NULL, and give an error. Open to suggestion here,
    > though.
    
    Hm, that seems like a reasonable compromise. As long as range types and
    arrays agree on the same basic lexical rules regarding quoting and whitespace
    (i.e. that spaces outside of double-quotes are non-significant, that keywords
    like NULL and INF/Infinity are case-insensitive, ...) I'm happy I guess.
    
    >> BTW, we currently represent infinity for floating point values as
    >> 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
    >> 
    >>  int4range(0,NULL,'[)')::text
    >> 
    >> return 
    >> 
    >>  '[0,Infinity)'?
    > 
    > I'm open to that, if you think it's an improvement I'll do it (but we
    > should probably pick one identifiable string and stick with it). What
    > I'd like to avoid is adding to the NULL/infinity confusion.
    
    I've thought about this some more, and came to realize that the question
    here really is whether
    
      floatrange(0, 'Infinity'::float, '[)')
    
    and
    
      floatrange(0, NULL, '[)')
    
    are the same thing or not. If they're not, then obviously using "Infinity"
    to represent omitted bounds is going to be very confusing. If they are,
    then using "Infinity" seems preferable. Maybe boundaries should be restricted
    to numeric float values (i.e. +/-Infinity and NaN should be rejected), though
    I dunno if the range type infrastructure supports that. Thoughts?
    
    best regards,
    Florian Pflug
    
    
    
    
    
  17. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T12:00:26Z

    On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
    >> What I really
    >> care about is that we don't talk ourselves into needing a zillion
    >> constructor functions.  Making things work with a single constructor
    >> function seems to me to simplify life quite a bit, and allowing there
    >> seems essential for that.
    >
    > I think we pretty much all agree on that. However, you did see the note
    > about the difficulty of using default parameters in built-in functions,
    > right?
    >
    > I ultimately ended up with 4 constructors, each with the same name but
    > 0, 1, 2, and 3 parameters. Suggestions welcome.
    >
    >> (I am also vaguely wondering what happens if if you have a text
    >> range.... is (nubile, null) ambiguous?)
    >
    > There are a few ways to handle that. I would lean toward parsing the
    > NULL as a special keyword, and then rejecting it (does it matter if it's
    > upper case?).
    
    Boy, that seems really weird to me.  If you're going to do it, it
    ought to be case-insensitive, but I think detecting the case only for
    the purpose of rejecting it is probably a mistake.  I mean, if
    (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
    really seem like they ought to be any different.  Otherwise, anyone
    who wants to construct these strings programatically is going to need
    to escape everything and always write ("cat","dog") or however you do
    that, and that seems like an unnecessary imposition.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  18. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-21T12:41:15Z

    On Sep21, 2011, at 14:00 , Robert Haas wrote:
    > On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis <pgsql@j-davis.com> wrote:
    >> On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
    >>> (I am also vaguely wondering what happens if if you have a text
    >>> range.... is (nubile, null) ambiguous?)
    >> 
    >> There are a few ways to handle that. I would lean toward parsing the
    >> NULL as a special keyword, and then rejecting it (does it matter if it's
    >> upper case?).
    > 
    > Boy, that seems really weird to me.  If you're going to do it, it
    > ought to be case-insensitive, but I think detecting the case only for
    > the purpose of rejecting it is probably a mistake.  I mean, if
    > (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
    > really seem like they ought to be any different.
    
    But that's exactly how arrays behave too. '{null,nutty}' is interpreted
    as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as
    ARRAY['nubile','nutty'].
    
    > Otherwise, anyone
    > who wants to construct these strings programatically is going to need
    > to escape everything and always write ("cat","dog") or however you do
    > that, and that seems like an unnecessary imposition.
    
    Unless you fully depart from what arrays you, you'll have to do that anyway
    because leading and trailing spaces aren't considered to be significant in
    non-quoted elements. In other words, '( cat , dog )' represents
    textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
    
    Also, as long as we need to recognize at least one special value meaning
    a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
    around the need for quotes in the general case. Well, expect making the
    representation of
    
      range(X, NULL, '[)') be '[X)',
    
    the one of
    
      range(NULL, X, '(]') be '(X]'
    
    and the one of
    
      range(NULL, NULL, '()') be '()',
    
    but I'm not sure that's an improvement. And even if it was, you'd still
    need to quote X if it contained one of "(",")","[","]" or ",". So most
    client would probably still choose to quote unconditionally, instead of
    detecting whether it was necessary or not.
    
    best regards,
    Florian Pflug
    
    
    
  19. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T12:47:00Z

    On Wed, Sep 21, 2011 at 8:41 AM, Florian Pflug <fgp@phlo.org> wrote:
    >> Boy, that seems really weird to me.  If you're going to do it, it
    >> ought to be case-insensitive, but I think detecting the case only for
    >> the purpose of rejecting it is probably a mistake.  I mean, if
    >> (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
    >> really seem like they ought to be any different.
    >
    > But that's exactly how arrays behave too. '{null,nutty}' is interpreted
    > as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as
    > ARRAY['nubile','nutty'].
    
    Oh.  Well, never mind then.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  20. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-09-21T14:41:39Z

    Florian Pflug <fgp@phlo.org> writes:
    > On Sep21, 2011, at 14:00 , Robert Haas wrote:
    >> Otherwise, anyone
    >> who wants to construct these strings programatically is going to need
    >> to escape everything and always write ("cat","dog") or however you do
    >> that, and that seems like an unnecessary imposition.
    
    > Unless you fully depart from what arrays you, you'll have to do that anyway
    > because leading and trailing spaces aren't considered to be significant in
    > non-quoted elements. In other words, '( cat , dog )' represents
    > textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
    
    Keep in mind that the array I/O behavior is widely considered to suck.
    When we defined the record I/O behavior, we did not emulate that
    whitespace weirdness, nor a number of other weirdnesses.  I would argue
    that ranges ought to model their I/O behavior on records not arrays,
    because that's not as much of a legacy syntax.
    
    > Also, as long as we need to recognize at least one special value meaning
    > a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
    > around the need for quotes in the general case.
    
    Right.  In the record case, we used an empty string for NULL, and then
    had to insist on quotes for actual empty strings.
    
    			regards, tom lane
    
    
  21. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-21T15:20:10Z

    On Sep21, 2011, at 16:41 , Tom Lane wrote:
    > Florian Pflug <fgp@phlo.org> writes:
    >> On Sep21, 2011, at 14:00 , Robert Haas wrote:
    >>> Otherwise, anyone
    >>> who wants to construct these strings programatically is going to need
    >>> to escape everything and always write ("cat","dog") or however you do
    >>> that, and that seems like an unnecessary imposition.
    > 
    >> Unless you fully depart from what arrays you, you'll have to do that anyway
    >> because leading and trailing spaces aren't considered to be significant in
    >> non-quoted elements. In other words, '( cat , dog )' represents
    >> textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
    > 
    > Keep in mind that the array I/O behavior is widely considered to suck.
    > When we defined the record I/O behavior, we did not emulate that
    > whitespace weirdness, nor a number of other weirdnesses.  I would argue
    > that ranges ought to model their I/O behavior on records not arrays,
    > because that's not as much of a legacy syntax.
    
    Interesting. Funnily enough, I always assumed it was the other way around.
    Probably because I don't care much for the empty-unquoted-string-is-NULL
    behaviour of records. But leaving that personal opinion aside, yeah, in that
    case we should make range I/O follow record I/O.
    
    >> Also, as long as we need to recognize at least one special value meaning
    >> a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
    >> around the need for quotes in the general case.
    > 
    > Right.  In the record case, we used an empty string for NULL, and then
    > had to insist on quotes for actual empty strings.
    
    Hm, so we'd have
    
      '(X,)' for range(X, NULL, '()'),
      '(,X)' for range(NULL, X, '()') and
      '(,)' for range(NULL, NULL, '()').
    
    We'd then have the choice of either declaring
    
      '(X,]' to mean '(X,)',
      '[,X)' to mean '(,X)' and
      '[,]' to mean '(,)'
    
    or to forbid the use of '[' and ']' for unspecified bounds.
    
    (Leaving out the ',' in the case of only one bound as in my reply to Robert's
    mail somewhere else in this thread doesn't actually work, since it'd be 
    ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').)
    
    One nice property is that, apart from the different brackets used, this
    representation is identical to the one used by records while still avoiding
    the infinity vs. NULL confusion.
    
    best regards,
    Florian Pflug
    
    
    
  22. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-21T17:00:39Z

    On Wed, 2011-09-21 at 17:20 +0200, Florian Pflug wrote:
    > Hm, so we'd have
    > 
    >   '(X,)' for range(X, NULL, '()'),
    >   '(,X)' for range(NULL, X, '()') and
    >   '(,)' for range(NULL, NULL, '()').
    > 
    > We'd then have the choice of either declaring
    > 
    >   '(X,]' to mean '(X,)',
    >   '[,X)' to mean '(,X)' and
    >   '[,]' to mean '(,)'
    > 
    > or to forbid the use of '[' and ']' for unspecified bounds.
    
    Right now, I just canonicalize it to round brackets if infinite. Seems
    pointless to reject it, but I can if someone thinks it's better.
    
    > (Leaving out the ',' in the case of only one bound as in my reply to Robert's
    > mail somewhere else in this thread doesn't actually work, since it'd be 
    > ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').)
    > 
    > One nice property is that, apart from the different brackets used, this
    > representation is identical to the one used by records while still avoiding
    > the infinity vs. NULL confusion.
    
    OK, I like that. Slightly strange to require quoting empty strings, but
    not stranger than the alternatives.
    
    While we're at it, any suggestions on the text representation of an
    empty range?
    
    Regards,
    	Jeff Davis
    
    
    
  23. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-21T17:02:32Z

    On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
    > I've thought about this some more, and came to realize that the question
    > here really is whether
    > 
    >   floatrange(0, 'Infinity'::float, '[)')
    > 
    > and
    > 
    >   floatrange(0, NULL, '[)')
    > 
    > are the same thing or not.
    
    The unbounded side of a range is never equal to a value in the data
    type's domain, so no, it's not the same.
    
    I think that we pretty much settled on just using an empty string for
    infinity in the other thread, right? So that makes this a non-issue.
    
    Regards,
    	Jeff Davis
    
    
    
  24. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-22T00:31:45Z

    On Sep21, 2011, at 19:00 , Jeff Davis wrote:
    > While we're at it, any suggestions on the text representation of an
    > empty range?
    
    My personal favourite would be '0', since it resembles the symbol used
    for empty sets in mathematics, and we already decided to use mathematical
    notation for ranges.
    
    If we're concerned that most of our users won't get that, then 'empty'
    would be a viable alternative I think.
    
    From a consistency POV it'd make sense to use a bracket-based syntax
    also for empty ranges. But the only available options would be '()' and '[]',
    which are too easily confused with '(,)' and '[,]' (which we already
    decided should represent the full range).
    
    best regards,
    Florian Pflug
    
    
    
  25. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-09-22T00:40:29Z

    On Sep21, 2011, at 19:02 , Jeff Davis wrote:
    > On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
    >> I've thought about this some more, and came to realize that the question
    >> here really is whether
    >> 
    >>  floatrange(0, 'Infinity'::float, '[)')
    >> 
    >> and
    >> 
    >>  floatrange(0, NULL, '[)')
    >> 
    >> are the same thing or not.
    > 
    > The unbounded side of a range is never equal to a value in the data
    > type's domain, so no, it's not the same.
    > 
    > I think that we pretty much settled on just using an empty string for
    > infinity in the other thread, right? So that makes this a non-issue.
    
    Agreed.
    
    best regards,
    Florian Pflug
    
    
    
  26. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-09-22T06:01:22Z

    On Thu, 2011-09-22 at 02:31 +0200, Florian Pflug wrote:
    > My personal favourite would be '0', since it resembles the symbol used
    > for empty sets in mathematics, and we already decided to use mathematical
    > notation for ranges.
    > 
    > If we're concerned that most of our users won't get that, then 'empty'
    > would be a viable alternative I think.
    > 
    > From a consistency POV it'd make sense to use a bracket-based syntax
    > also for empty ranges. But the only available options would be '()' and '[]',
    > which are too easily confused with '(,)' and '[,]' (which we already
    > decided should represent the full range).
    
    Yes, I think () is too close to (,).
    
    Brainstorming so far:
     0       : simple, looks like the empty set symbol
     empty   : simple
     <empty> : a little more obvious that it's special
     <>      : visually looks empty
     -       : also looks empty
     {}      : mathematical notation, but doesn't quite fit ranges
    
    I don't have a strong opinion. I'd be OK with any of those.
    
    Regards,
    	Jeff Davis
    
    
    
    
  27. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-02T06:12:03Z

    On Wed, 2011-09-21 at 10:41 -0400, Tom Lane wrote:
    > Florian Pflug <fgp@phlo.org> writes:
    > > On Sep21, 2011, at 14:00 , Robert Haas wrote:
    > >> Otherwise, anyone
    > >> who wants to construct these strings programatically is going to need
    > >> to escape everything and always write ("cat","dog") or however you do
    > >> that, and that seems like an unnecessary imposition.
    > 
    > > Unless you fully depart from what arrays you, you'll have to do that anyway
    > > because leading and trailing spaces aren't considered to be significant in
    > > non-quoted elements. In other words, '( cat , dog )' represents
    > > textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
    > 
    > Keep in mind that the array I/O behavior is widely considered to suck.
    > When we defined the record I/O behavior, we did not emulate that
    > whitespace weirdness, nor a number of other weirdnesses.  I would argue
    > that ranges ought to model their I/O behavior on records not arrays,
    > because that's not as much of a legacy syntax.
    
    Done. Now range types more closely resemble records in parsing behavior.
    Patch attached.
    
    Changes:
     * new parser + doc changes
     * merged with master
     * pg_dump now work
     * various cleanup
    
    TODO:
     * ultimately, there should be a simple cache to avoid repeated syscache
    lookups. I have put this off mainly to avoid premature optimization, but
    that code has been pretty stable for a while
     * support casts and typmod -- this requires the work for the
    RangeCoerceExpr that I mentioned before
    
    Regards,
    	Jeff Davis
    
  28. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-10-02T09:32:29Z

    On Oct2, 2011, at 08:12 , Jeff Davis wrote:
    > Done. Now range types more closely resemble records in parsing behavior.
    > Patch attached.
    
    Cool!
    
    Looking at the patch, I noticed that it's possible to specify the default
    boundaries ([], [), (] or ()) per individual float type with the
    DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
    do more harm then good - it makes it impossible to deduce the meaning of
    e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
    
    I suggest we pick one set of default boundaries, ideally '[)' since that
    is what all the built-in canonization functions produce, and stick with it.
    
    best regards,
    Florian Pflug
    
    
    
  29. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-02T16:01:10Z

    On Sun, 2011-10-02 at 11:32 +0200, Florian Pflug wrote:
    > Looking at the patch, I noticed that it's possible to specify the default
    > boundaries ([], [), (] or ()) per individual float type with the
    > DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
    > do more harm then good - it makes it impossible to deduce the meaning of
    > e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
    > 
    > I suggest we pick one set of default boundaries, ideally '[)' since that
    > is what all the built-in canonization functions produce, and stick with it.
    
    That sounds reasonable to me. Unless someone objects, I'll make the
    change in the next patch.
    
    Regards,
    	Jeff Davis
    
    
    
  30. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-02T19:05:31Z

    On Sun, 2011-10-02 at 11:32 +0200, Florian Pflug wrote:
    > Looking at the patch, I noticed that it's possible to specify the default
    > boundaries ([], [), (] or ()) per individual float type with the
    > DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
    > do more harm then good - it makes it impossible to deduce the meaning of
    > e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
    > 
    > I suggest we pick one set of default boundaries, ideally '[)' since that
    > is what all the built-in canonization functions produce, and stick with it.
    
    Done.
    
    Also, made the range parsing even more like records with more code
    copied verbatim. And fixed some parsing tests along the way.
    
    Regards,
    	Jeff Davis
    
  31. Re: Range Types - typo + NULL string constructor

    Alexander Korotkov <aekorotkov@gmail.com> — 2011-10-06T19:26:09Z

    Hi, Jeff!
    
    Heikki has recently commited my patch about picksplit for GiST on points and
    boxes:
    http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7f3bd86843e5aad84585a57d3f6b80db3c609916
    I would like to try this picksplit method on ranges. I believe that it might
    be much more efficient on highly overlapping ranges than current picksplit.
    Range types patch isn't commited, yet. So, what way of work publishing is
    prefered in this case? Add-on patch, separate branch in git or something
    else?
    
    ------
    With best regards,
    Alexander Korotkov.
    
  32. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-07T03:41:42Z

    On Thu, 2011-10-06 at 23:26 +0400, Alexander Korotkov wrote:
    > Hi, Jeff!
    > 
    > 
    > Heikki has recently commited my patch about picksplit for GiST on
    > points and boxes:
    > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7f3bd86843e5aad84585a57d3f6b80db3c609916
    > I would like to try this picksplit method on ranges. I believe that it
    > might be much more efficient on highly overlapping ranges than current
    > picksplit. Range types patch isn't commited, yet. So, what way of work
    > publishing is prefered in this case? Add-on patch, separate branch in
    > git or something else?
    
    Sounds great! Thank you.
    
    The repo is available here:
    
    http://git.postgresql.org/gitweb/?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes
    
    I'd prefer to include it in the initial patch. If the current GiST code
    is going to be replaced, then there's not much sense reviewing/testing
    it.
    
    You may need to consider unbounded and empty ranges specially. I made an
    attempt to do so in the current GiST code, and you might want to take a
    look at that first. I'm not particularly attached to my approach, but we
    should do something reasonable with unbounded and empty ranges.
    
    Regards,
    	Jeff Davis
    
    
    
  33. Re: Range Types - typo + NULL string constructor

    Jeff Janes <jeff.janes@gmail.com> — 2011-10-08T19:44:52Z

    On Sun, Oct 2, 2011 at 12:05 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Sun, 2011-10-02 at 11:32 +0200, Florian Pflug wrote:
    >> Looking at the patch, I noticed that it's possible to specify the default
    >> boundaries ([], [), (] or ()) per individual float type with the
    >> DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
    >> do more harm then good - it makes it impossible to deduce the meaning of
    >> e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
    >>
    >> I suggest we pick one set of default boundaries, ideally '[)' since that
    >> is what all the built-in canonization functions produce, and stick with it.
    >
    > Done.
    >
    > Also, made the range parsing even more like records with more code
    > copied verbatim. And fixed some parsing tests along the way.
    
    When I apply this to head, "make check" fails with:
    
      create type textrange_en_us as range(subtype=text, collation="en_US");
    + ERROR:  collation "en_US" for encoding "SQL_ASCII" does not exist
    
    Is this a problem?  e.g. will it break the build-farm?
    
    "make check LANG=en_US" does pass
    
    Cheers,
    
    Jeff
    
    
  34. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-09T00:14:21Z

    On Sat, 2011-10-08 at 12:44 -0700, Jeff Janes wrote:
    > When I apply this to head, "make check" fails with:
    > 
    >   create type textrange_en_us as range(subtype=text, collation="en_US");
    > + ERROR:  collation "en_US" for encoding "SQL_ASCII" does not exist
    > 
    > Is this a problem?  e.g. will it break the build-farm?
    > 
    > "make check LANG=en_US" does pass
    
    Thank you for pointing that out. I think I need to remove those before
    commit, but I just wanted them in there now to exercise that part of the
    code.
    
    Is there a better way to test collations like that?
    
    Regards,
    	Jeff Davis
    
    
    
  35. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-09T16:06:34Z

    Jeff Davis <pgsql@j-davis.com> writes:
    > On Sat, 2011-10-08 at 12:44 -0700, Jeff Janes wrote:
    >> When I apply this to head, "make check" fails with:
    >> 
    >> create type textrange_en_us as range(subtype=text, collation="en_US");
    >> + ERROR:  collation "en_US" for encoding "SQL_ASCII" does not exist
    
    > Thank you for pointing that out. I think I need to remove those before
    > commit, but I just wanted them in there now to exercise that part of the
    > code.
    
    > Is there a better way to test collations like that?
    
    You could add some code that assumes particular collations are present
    into collate.linux.utf8.sql.  It's not going to work to depend on
    anything beyond C locale being present in a mainline regression test
    case.
    
    			regards, tom lane
    
    
  36. Re: Range Types - typo + NULL string constructor

    Thom Brown <thom@linux.com> — 2011-10-10T13:27:03Z

    On 2 October 2011 20:05, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Sun, 2011-10-02 at 11:32 +0200, Florian Pflug wrote:
    >> Looking at the patch, I noticed that it's possible to specify the default
    >> boundaries ([], [), (] or ()) per individual float type with the
    >> DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
    >> do more harm then good - it makes it impossible to deduce the meaning of
    >> e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
    >>
    >> I suggest we pick one set of default boundaries, ideally '[)' since that
    >> is what all the built-in canonization functions produce, and stick with it.
    >
    > Done.
    >
    > Also, made the range parsing even more like records with more code
    > copied verbatim. And fixed some parsing tests along the way.
    
    I don't know if this has already been discussed, but can you explain
    the following:
    
    postgres=# select '[1,8]'::int4range;
     int4range
    -----------
     [1,9)
    (1 row)
    
    It seems unintuitive to represent a discrete range using an exclusive
    upper bound.  While I agree that the value itself is correct, it's
    representation looks odd.  Is it necessary?
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  37. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-10T16:44:51Z

    On Mon, 2011-10-10 at 14:27 +0100, Thom Brown wrote:
    > I don't know if this has already been discussed, but can you explain
    > the following:
    > 
    > postgres=# select '[1,8]'::int4range;
    >  int4range
    > -----------
    >  [1,9)
    > (1 row)
    > 
    > It seems unintuitive to represent a discrete range using an exclusive
    > upper bound.  While I agree that the value itself is correct, it's
    > representation looks odd.  Is it necessary?
    
    The "canonicalize" function (specified at type creation time) allows you
    to specify the canonical output representation. So, I can change the
    canonical form for discrete ranges to use '[]' notation if we think
    that's more expected.
    
    But then "int4range(1,8)" would still mean "int4range(1,8,'[)')" and
    therefore '[1,7]'. I used to have a "default_flags" parameter that could
    also be specified at type creation time that would control the default
    third parameter (the parameter that controls inclusivity) of the
    constructor. However, I removed the "default_flags" parameter because,
    as Florian pointed out, it's better to have a consistent output from the
    constructor.
    
    I'm open to suggestions, including potentially bringing back
    "default_flags".
    
    Regards,
    	Jeff Davis
    
    
    
  38. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-10T16:53:54Z

    Jeff Davis <pgsql@j-davis.com> writes:
    > On Mon, 2011-10-10 at 14:27 +0100, Thom Brown wrote:
    >> I don't know if this has already been discussed, but can you explain
    >> the following:
    >> 
    >> postgres=# select '[1,8]'::int4range;
    >> int4range
    >> -----------
    >> [1,9)
    >> (1 row)
    >> 
    >> It seems unintuitive to represent a discrete range using an exclusive
    >> upper bound.  While I agree that the value itself is correct, it's
    >> representation looks odd.  Is it necessary?
    
    > The "canonicalize" function (specified at type creation time) allows you
    > to specify the canonical output representation. So, I can change the
    > canonical form for discrete ranges to use '[]' notation if we think
    > that's more expected.
    
    What if I write '[1,INT_MAX]'::int4range?  The open-parenthesis form will
    fail with an integer overflow.  I suppose you could canonicalize it to
    an unbounded range, but that seems unnecessarily surprising.
    
    			regards, tom lane
    
    
  39. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-10-10T17:22:50Z

    On Oct10, 2011, at 18:53 , Tom Lane wrote:
    > What if I write '[1,INT_MAX]'::int4range?  The open-parenthesis form will
    > fail with an integer overflow.  I suppose you could canonicalize it to
    > an unbounded range, but that seems unnecessarily surprising.
    
    That is a very good point. Canonicalizing to an unbounded range doesn't work,
    because, as it stands, the ranges '[1, INT_MAX]' and '[1,)' are *not* equal. So
    the only remaining option is to canonicalize to the closed form always.
    
    I still think we should strive for consistency here, so let's also make
    '[]' the default flags for the range constructors.
    
    best regards,
    Florian Pflug
    
    
    
  40. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-10T17:31:09Z

    On Mon, 2011-10-10 at 12:53 -0400, Tom Lane wrote:
    > > The "canonicalize" function (specified at type creation time) allows you
    > > to specify the canonical output representation. So, I can change the
    > > canonical form for discrete ranges to use '[]' notation if we think
    > > that's more expected.
    > 
    > What if I write '[1,INT_MAX]'::int4range?  The open-parenthesis form will
    > fail with an integer overflow.  I suppose you could canonicalize it to
    > an unbounded range, but that seems unnecessarily surprising.
    
    So, are you suggesting that I canonicalize to '[]' then? That seems
    reasonable to me, but there's still some slight awkwardness because
    int4range(1,10) would be '[1,9]'.
    
    Regards,
    	Jeff Davis
    
    
    
  41. Re: Range Types - typo + NULL string constructor

    Thom Brown <thom@linux.com> — 2011-10-10T17:39:20Z

    On 10 October 2011 18:31, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Mon, 2011-10-10 at 12:53 -0400, Tom Lane wrote:
    >> > The "canonicalize" function (specified at type creation time) allows you
    >> > to specify the canonical output representation. So, I can change the
    >> > canonical form for discrete ranges to use '[]' notation if we think
    >> > that's more expected.
    >>
    >> What if I write '[1,INT_MAX]'::int4range?  The open-parenthesis form will
    >> fail with an integer overflow.  I suppose you could canonicalize it to
    >> an unbounded range, but that seems unnecessarily surprising.
    >
    > So, are you suggesting that I canonicalize to '[]' then? That seems
    > reasonable to me, but there's still some slight awkwardness because
    > int4range(1,10) would be '[1,9]'.
    
    Why?  int4range(1,10,'[]') returns:
    
     int4range
    -----------
     [1,11)
    (1 row)
    
    Which if corrected to display the proposed way would just be '[1,10]'.
     So the default boundaries should be '[]' as opposed to '[)' as it is
    now.
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  42. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-10T17:41:42Z

    On Mon, 2011-10-10 at 19:22 +0200, Florian Pflug wrote:
    > I still think we should strive for consistency here, so let's also make
    > '[]' the default flags for the range constructors.
    
    For continuous ranges I don't think that's a good idea. Closed-open is a
    very widely-accepted convention and there are good reasons for it -- for
    instance, it's good for specifying contiguous-but-non-overlapping
    ranges.
    
    So, I think we either need to standardize on '[)' or allow different
    default_flags for different types. Or, always specify the inclusivity in
    the constructor (hopefully in a convenient way).
    
    Regards,
    	Jeff Davis
    
    
    
  43. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-10T17:45:01Z

    On Mon, 2011-10-10 at 18:39 +0100, Thom Brown wrote:
    >  So the default boundaries should be '[]' as opposed to '[)' as it is
    > now.
    
    Would that vary between range types? In other words, do I bring back
    default_flags?
    
    If not, I think a lot of people will object. The most common use-case
    for range types are for continuous ranges like timestamps. And (as I
    pointed out in reply to Florian) there are good reasons to use the '[)'
    convention for those cases.
    
    Regards,
    	Jeff Davis
    
    
    
    
    
    
  44. Re: Range Types - typo + NULL string constructor

    Thom Brown <thom@linux.com> — 2011-10-10T17:53:34Z

    On 10 October 2011 18:45, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Mon, 2011-10-10 at 18:39 +0100, Thom Brown wrote:
    >>  So the default boundaries should be '[]' as opposed to '[)' as it is
    >> now.
    >
    > Would that vary between range types? In other words, do I bring back
    > default_flags?
    >
    > If not, I think a lot of people will object. The most common use-case
    > for range types are for continuous ranges like timestamps. And (as I
    > pointed out in reply to Florian) there are good reasons to use the '[)'
    > convention for those cases.
    
    I'm proposing it for discrete ranges.  For continuous ranges, I guess
    it makes sense to have "up to, but not including".  The same boundary
    inclusivity/exclusivity thing seems unintuitive for discrete ranges.
    This has the downside of inconsistency, but I don't think that's
    really a solid argument against it since their use will be different
    anyway.
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  45. Re: Range Types - typo + NULL string constructor

    Thom Brown <thom@linux.com> — 2011-10-10T18:06:16Z

    On 10 October 2011 18:53, Thom Brown <thom@linux.com> wrote:
    > On 10 October 2011 18:45, Jeff Davis <pgsql@j-davis.com> wrote:
    >> On Mon, 2011-10-10 at 18:39 +0100, Thom Brown wrote:
    >>>  So the default boundaries should be '[]' as opposed to '[)' as it is
    >>> now.
    >>
    >> Would that vary between range types? In other words, do I bring back
    >> default_flags?
    >>
    >> If not, I think a lot of people will object. The most common use-case
    >> for range types are for continuous ranges like timestamps. And (as I
    >> pointed out in reply to Florian) there are good reasons to use the '[)'
    >> convention for those cases.
    >
    > I'm proposing it for discrete ranges.  For continuous ranges, I guess
    > it makes sense to have "up to, but not including".  The same boundary
    > inclusivity/exclusivity thing seems unintuitive for discrete ranges.
    > This has the downside of inconsistency, but I don't think that's
    > really a solid argument against it since their use will be different
    > anyway.
    
    Okay, a real example of why discrete should be '[]' and continuous
    should be '[)'.
    
    If you book a meeting from 09:00 to 11:00 (tsrange), at 11:00
    precisely it either becomes free or is available to someone else, so
    it can be booked 11:00 to 12:00 without conflict.
    
    If you have raffle tickets numbered 1 to 100 (int4range), and you ask
    for tickets 9 to 11, no-one else can use 11 as it aligns with the last
    one you bought.
    
    So for me, it's intuitive for them to behave differently.  So yes,
    default behaviour would vary between types, but I didn't previously
    read anything on default_flags, so I don't know where that comes into
    it.  Shouldn't it be the case that if a type has a canonical function,
    it's entirely inclusive, otherwise it's upper boundary is exclusive?
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  46. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-10-11T01:14:57Z

    On Oct10, 2011, at 20:06 , Thom Brown wrote:
    > Okay, a real example of why discrete should be '[]' and continuous
    > should be '[)'.
    > 
    > If you book a meeting from 09:00 to 11:00 (tsrange), at 11:00
    > precisely it either becomes free or is available to someone else, so
    > it can be booked 11:00 to 12:00 without conflict.
    > 
    > If you have raffle tickets numbered 1 to 100 (int4range), and you ask
    > for tickets 9 to 11, no-one else can use 11 as it aligns with the last
    > one you bought.
    > 
    > So for me, it's intuitive for them to behave differently.  So yes,
    > default behaviour would vary between types, but I didn't previously
    > read anything on default_flags, so I don't know where that comes into
    > it.  Shouldn't it be the case that if a type has a canonical function,
    > it's entirely inclusive, otherwise it's upper boundary is exclusive?
    
    I'm not convinced by this. The question here is not whether which types
    of ranges we *support*, only which type we consider to be more *canonical*,
    and whether the bounds provided to a range constructor are inclusive or
    exclusive by *default*.
    
    Maybe ranges over discrete types are slightly more likely to be closed,
    and ranges over continuous types slightly more likely to be open. Still,
    I very much doubt that the skew in the distribution is large enough to
    warrant the confusion and possibility of subtle bugs we introduce by making
    the semantics of a range type's constructor depend on the definition of the
    range and/or base type. Especially since we're talking about only *6* extra
    characters to communicate the intended inclusivity/exclusivity of the bounds
    to the range constructor.
    
    Also, the distinction between types for which ranges should "obviously"
    be closed, and those for which they should "obviously" be half-open is nowhere
    as clear-cut as it seems at first sight.
    
    First, there's the type "date", which in my book is discrete. So we'd make
    date ranges closed by default, not half-open. And there's timestamp, which
    is continuous so we'd make its default half-open. That doesn't seem exactly
    intuitive to me.
    
    Second, there's int4 and float8, one discrete, one continuous. So would we
    make int4range(1, 2) include 2, but float8range(1.0, 2.0) *not* include 2.0?
    
    best regards,
    Florian Pflug
    
    
    
  47. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-10-11T01:26:01Z

    On Oct10, 2011, at 19:41 , Jeff Davis wrote:
    > On Mon, 2011-10-10 at 19:22 +0200, Florian Pflug wrote:
    >> I still think we should strive for consistency here, so let's also make
    >> '[]' the default flags for the range constructors.
    > 
    > For continuous ranges I don't think that's a good idea. Closed-open is a
    > very widely-accepted convention and there are good reasons for it -- for
    > instance, it's good for specifying contiguous-but-non-overlapping
    > ranges.
    
    It really depends on what you're using ranges for. Yeah, if you're "convering"
    something with ranges (like mapping things to a certain period of time, or
    an area of space), then half-open ranges are probably very common.
    
    If, OTOH, you're storing measurement with error margins, then open ranges,
    i.e. '()', are probably what you want.
    
    I still firmly believe that consistency trumps convenience here. Specifying
    the range boundaries' exclusivity/inclusivity explicitly is quite cheap...
    
    > So, I think we either need to standardize on '[)' or allow different
    > default_flags for different types. Or, always specify the inclusivity in
    > the constructor (hopefully in a convenient way).
    
    In the light of Tom's argument, my pick would be '[]'. It's seem strange
    to normalize ranges over discrete types to closed ranges, yet make the
    construction function expect open boundaries by default.
    
    best regards,
    Florian Pflug
    
    
    
  48. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-11T05:25:18Z

    On Tue, 2011-10-11 at 03:14 +0200, Florian Pflug wrote:
    > Maybe ranges over discrete types are slightly more likely to be closed,
    > and ranges over continuous types slightly more likely to be open. Still,
    > I very much doubt that the skew in the distribution is large enough to
    > warrant the confusion and possibility of subtle bugs we introduce by making
    > the semantics of a range type's constructor depend on the definition of the
    > range and/or base type.
    
    I think you persuaded me on the consistency aspect.
    
    I'm wondering whether to do away with the default argument entirely, and
    just force the user to always specify it during construction. It seems
    like a shame that such pain is caused over the syntax, because in a
    perfect world it wouldn't be a bother to specify it at all. I even
    considered using prefix/postfix operators to try to make it nicer, but
    it seems like every idea I had was just short of practical. Maybe a few
    extra characters at the end aren't so bad.
    
    I'd like to hear from some potential users though to see if anyone
    recoils at the common case.
    
    Regards,
    	Jeff Davis
    
    
    
  49. Re: Range Types - typo + NULL string constructor

    Thom Brown <thom@linux.com> — 2011-10-11T07:25:47Z

    On 11 October 2011 02:14, Florian Pflug <fgp@phlo.org> wrote:
    > On Oct10, 2011, at 20:06 , Thom Brown wrote:
    >> Okay, a real example of why discrete should be '[]' and continuous
    >> should be '[)'.
    >>
    >> If you book a meeting from 09:00 to 11:00 (tsrange), at 11:00
    >> precisely it either becomes free or is available to someone else, so
    >> it can be booked 11:00 to 12:00 without conflict.
    >>
    >> If you have raffle tickets numbered 1 to 100 (int4range), and you ask
    >> for tickets 9 to 11, no-one else can use 11 as it aligns with the last
    >> one you bought.
    >>
    >> So for me, it's intuitive for them to behave differently.  So yes,
    >> default behaviour would vary between types, but I didn't previously
    >> read anything on default_flags, so I don't know where that comes into
    >> it.  Shouldn't it be the case that if a type has a canonical function,
    >> it's entirely inclusive, otherwise it's upper boundary is exclusive?
    >
    > First, there's the type "date", which in my book is discrete. So we'd make
    > date ranges closed by default, not half-open. And there's timestamp, which
    > is continuous so we'd make its default half-open. That doesn't seem exactly
    > intuitive to me.
    
    Ah yes, I agree there.  Okay, I see your point.
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  50. Re: Range Types - typo + NULL string constructor

    David Fetter <david@fetter.org> — 2011-10-11T12:43:07Z

    On Mon, Oct 10, 2011 at 10:25:18PM -0700, Jeff Davis wrote:
    > On Tue, 2011-10-11 at 03:14 +0200, Florian Pflug wrote:
    > > Maybe ranges over discrete types are slightly more likely to be
    > > closed, and ranges over continuous types slightly more likely to
    > > be open. Still, I very much doubt that the skew in the
    > > distribution is large enough to warrant the confusion and
    > > possibility of subtle bugs we introduce by making the semantics of
    > > a range type's constructor depend on the definition of the range
    > > and/or base type.
    > 
    > I think you persuaded me on the consistency aspect.
    > 
    > I'm wondering whether to do away with the default argument entirely,
    > and just force the user to always specify it during construction. It
    > seems like a shame that such pain is caused over the syntax, because
    > in a perfect world it wouldn't be a bother to specify it at all. I
    > even considered using prefix/postfix operators to try to make it
    > nicer, but it seems like every idea I had was just short of
    > practical. Maybe a few extra characters at the end aren't so bad.
    > 
    > I'd like to hear from some potential users though to see if anyone
    > recoils at the common case.
    
    I'd recoil at not having ranges default to left-closed, right-open.
    The use case for that one is so compelling that I'm OK with making it
    the default from which deviations need to be specified.
    
    Cheers,
    David.
    -- 
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
    Skype: davidfetter      XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
    
    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
    
    
  51. Re: Range Types - typo + NULL string constructor

    Florian G. Pflug <fgp@phlo.org> — 2011-10-11T13:20:05Z

    On Oct11, 2011, at 14:43 , David Fetter wrote:
    > I'd recoil at not having ranges default to left-closed, right-open.
    > The use case for that one is so compelling that I'm OK with making it
    > the default from which deviations need to be specified.
    
    The downside of that is that, as Tom pointed out upthread, we cannot
    make [) the canonical representation of ranges. It'd require us to
    increment the right boundary of a closed range, but that incremented
    boundary might no longer be in the base type's domain.
    
    So we'd end up with [) being the default for range construction,
    but [] being the canonical representation, i.e. what you get back
    when SELECTing a range (over a discrete base type).
    
    Certainly not the end of the world, but is the convenience of being
    able to write somerange(a, b) instead of somerange(a, b, '[)') really
    worth it? I kind of doubt that...
    
    best regards,
    Florian Pflug
    
    
    
  52. Re: Range Types - typo + NULL string constructor

    David Fetter <david@fetter.org> — 2011-10-11T13:28:30Z

    On Tue, Oct 11, 2011 at 03:20:05PM +0200, Florian Pflug wrote:
    > On Oct11, 2011, at 14:43 , David Fetter wrote:
    > > I'd recoil at not having ranges default to left-closed,
    > > right-open.  The use case for that one is so compelling that I'm
    > > OK with making it the default from which deviations need to be
    > > specified.
    > 
    > The downside of that is that, as Tom pointed out upthread, we cannot
    > make [) the canonical representation of ranges. It'd require us to
    > increment the right boundary of a closed range, but that incremented
    > boundary might no longer be in the base type's domain.
    > 
    > So we'd end up with [) being the default for range construction, but
    > [] being the canonical representation, i.e. what you get back when
    > SELECTing a range (over a discrete base type).
    > 
    > Certainly not the end of the world, but is the convenience of being
    > able to write somerange(a, b) instead of somerange(a, b, '[)')
    > really worth it? I kind of doubt that...
    
    You're making a persuasive argument for the latter based solely on the
    clarity.  If people see that 3rd element in the DDL, or need to
    provide it, it's *very* obvious what's going on.
    
    Cheers,
    David (who suspects that having a syntax like somerange[a,b) just
    won't work with the current state of parsers, etc.)
    -- 
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
    Skype: davidfetter      XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
    
    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
    
    
  53. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-11T16:03:12Z

    On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
    > > Certainly not the end of the world, but is the convenience of being
    > > able to write somerange(a, b) instead of somerange(a, b, '[)')
    > > really worth it? I kind of doubt that...
    > 
    > You're making a persuasive argument for the latter based solely on the
    > clarity.  If people see that 3rd element in the DDL, or need to
    > provide it, it's *very* obvious what's going on.
    
    That was how I originally thought, but we're also providing built-in
    range types like tsrange and daterange. I could see how if the former
    excluded the endpoint and the latter included it, it could be confusing.
    
    We could go back to having different constructor names for different
    inclusivity; e.g. int4range_cc(1,10). That at least removes the
    awkwardness of typing (and seeing) '[]'.
    
    Regards,
    	Jeff Davis
    
    
    
  54. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-10-11T16:09:01Z

    On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
    >> > Certainly not the end of the world, but is the convenience of being
    >> > able to write somerange(a, b) instead of somerange(a, b, '[)')
    >> > really worth it? I kind of doubt that...
    >>
    >> You're making a persuasive argument for the latter based solely on the
    >> clarity.  If people see that 3rd element in the DDL, or need to
    >> provide it, it's *very* obvious what's going on.
    >
    > That was how I originally thought, but we're also providing built-in
    > range types like tsrange and daterange. I could see how if the former
    > excluded the endpoint and the latter included it, it could be confusing.
    >
    > We could go back to having different constructor names for different
    > inclusivity; e.g. int4range_cc(1,10). That at least removes the
    > awkwardness of typing (and seeing) '[]'.
    
    The cure seems worse than the disease.  What is so bad about '[]'?
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  55. Re: Range Types - typo + NULL string constructor

    David Fetter <david@fetter.org> — 2011-10-11T16:12:10Z

    On Tue, Oct 11, 2011 at 12:09:01PM -0400, Robert Haas wrote:
    > On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > > On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
    > >> > Certainly not the end of the world, but is the convenience of being
    > >> > able to write somerange(a, b) instead of somerange(a, b, '[)')
    > >> > really worth it? I kind of doubt that...
    > >>
    > >> You're making a persuasive argument for the latter based solely on the
    > >> clarity.  If people see that 3rd element in the DDL, or need to
    > >> provide it, it's *very* obvious what's going on.
    > >
    > > That was how I originally thought, but we're also providing built-in
    > > range types like tsrange and daterange. I could see how if the former
    > > excluded the endpoint and the latter included it, it could be confusing.
    > >
    > > We could go back to having different constructor names for different
    > > inclusivity; e.g. int4range_cc(1,10). That at least removes the
    > > awkwardness of typing (and seeing) '[]'.
    > 
    > The cure seems worse than the disease.  What is so bad about '[]'?
    
    Nothing's bad about '[]' per se.  What's better, but possibly out of
    the reach of our current lexing and parsing system, would be things
    like:
    
    [1::int, 10)
    
    Cheers,
    David.
    -- 
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
    Skype: davidfetter      XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
    
    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
    
    
  56. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-10-11T16:18:18Z

    On Tue, Oct 11, 2011 at 12:12 PM, David Fetter <david@fetter.org> wrote:
    > Nothing's bad about '[]' per se.  What's better, but possibly out of
    > the reach of our current lexing and parsing system, would be things
    > like:
    >
    > [1::int, 10)
    
    That's been discussed before.  Aside from the parser issues (which are
    formidable) it would break brace-matching in most if not all commonly
    used editors.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  57. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-11T16:30:02Z

    On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote:
    > The cure seems worse than the disease.  What is so bad about '[]'?
    
    OK, so we stick with the 3-argument form. Do we have a default for the
    third argument, or do we scrap it to avoid confusion?
    
    There were some fairly strong objections to using '[]' as the default or
    having the default vary between types. So, the only real option
    remaining, if we do have a default, is '[)'.
    
    Regards,
    	Jeff Davis
    
    
    
    
  58. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-10-11T16:40:32Z

    On Tue, Oct 11, 2011 at 12:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote:
    >> The cure seems worse than the disease.  What is so bad about '[]'?
    >
    > OK, so we stick with the 3-argument form. Do we have a default for the
    > third argument, or do we scrap it to avoid confusion?
    >
    > There were some fairly strong objections to using '[]' as the default or
    > having the default vary between types. So, the only real option
    > remaining, if we do have a default, is '[)'.
    
    I think using '[)' is fine.  At some level, this is just a question of
    expectations.  If you expect that int4range(1,4) will create a range
    that includes 4, well, you're wrong.  Once you get used to it, it will
    seem normal, and you'll know that you need to write
    int4range(1,4,'[]') if that's what you want.  As long as the system is
    designed around a set of consistent and well-thought-out principles,
    people will get used to the details.  I don't see that the idea of a
    half-open range over a discrete-valued type is particularly confusing
    - we use them all the time in programming, when we make the end
    pointer point to the byte following the end of the array, rather than
    the last element - but even if it is, overall design consistency
    trumps what someone may find to be the absolutely perfect behavior in
    some particular case.  And saving typing is nearly always good -
    unless it creates a LOT more confusion than I think this will.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  59. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-11T17:48:41Z

    On Tue, 2011-10-11 at 12:40 -0400, Robert Haas wrote:
    > I think using '[)' is fine.  At some level, this is just a question of
    > expectations.  If you expect that int4range(1,4) will create a range
    > that includes 4, well, you're wrong.  Once you get used to it, it will
    > seem normal, and you'll know that you need to write
    > int4range(1,4,'[]') if that's what you want.  As long as the system is
    > designed around a set of consistent and well-thought-out principles,
    > people will get used to the details.  I don't see that the idea of a
    > half-open range over a discrete-valued type is particularly confusing
    > - we use them all the time in programming, when we make the end
    > pointer point to the byte following the end of the array, rather than
    > the last element - but even if it is, overall design consistency
    > trumps what someone may find to be the absolutely perfect behavior in
    > some particular case.  And saving typing is nearly always good -
    > unless it creates a LOT more confusion than I think this will.
    
    That sounds very reasonable to me.
    
    Tom made an observation about '[1,INT_MAX]' thowing an error because
    canonicalization would try to increment INT_MAX. But I'm not
    particularly disturbed by it. If you want a bigger range, use int8range
    or numrange -- the same advice we give to people who want unsigned
    types. Or, for people who really need the entire range of signed int4
    exactly, they can easily make their own range type that canonicalizes to
    '[]'.
    
    Regards,
    	Jeff Davis
    
    
    
  60. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-11T17:51:46Z

    Jeff Davis <pgsql@j-davis.com> writes:
    > Tom made an observation about '[1,INT_MAX]' thowing an error because
    > canonicalization would try to increment INT_MAX. But I'm not
    > particularly disturbed by it. If you want a bigger range, use int8range
    > or numrange -- the same advice we give to people who want unsigned
    > types. Or, for people who really need the entire range of signed int4
    > exactly, they can easily make their own range type that canonicalizes to
    > '[]'.
    
    I agree we shouldn't contort the entire design to avoid that corner
    case.  We should, however, make sure that the increment *does* throw
    an error, and not just silently overflow.
    
    			regards, tom lane
    
    
  61. Re: Range Types - typo + NULL string constructor

    David Fetter <david@fetter.org> — 2011-10-11T18:04:07Z

    On Tue, Oct 11, 2011 at 12:18:18PM -0400, Robert Haas wrote:
    > On Tue, Oct 11, 2011 at 12:12 PM, David Fetter <david@fetter.org> wrote:
    > > Nothing's bad about '[]' per se.  What's better, but possibly out
    > > of the reach of our current lexing and parsing system, would be
    > > things like:
    > >
    > > [1::int, 10)
    > 
    > That's been discussed before.  Aside from the parser issues (which
    > are formidable) it would break brace-matching in most if not all
    > commonly used editors.
    
    That being the situation, ubiquitous support for the natural syntax
    looks like it's a decade away, minimum. :(
    
    Trying to be cheery,
    David.
    -- 
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
    Skype: davidfetter      XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
    
    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
    
    
  62. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-11T19:57:26Z

    Florian Pflug <fgp@phlo.org> writes:
    > On Oct11, 2011, at 14:43 , David Fetter wrote:
    >> I'd recoil at not having ranges default to left-closed, right-open.
    >> The use case for that one is so compelling that I'm OK with making it
    >> the default from which deviations need to be specified.
    
    I agree with David on this.
    
    > The downside of that is that, as Tom pointed out upthread, we cannot
    > make [) the canonical representation of ranges.
    
    Yeah, we certainly *can* do that, we just have to allow ranges that
    include the last element of the domain to be corner cases that require
    special handling.  If we don't want to just fail, we have to
    canonicalize them to closed instead of open ranges.  It does not follow
    that the default on input has to be closed.
    
    Note that the INT_MAX case is probably not the worst issue in practice.
    What is going to be an issue is ranges over enum types, where having the
    last element being part of the range is a much more likely use-case.
    
    			regards, tom lane
    
    
  63. Re: Range Types - typo + NULL string constructor

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-14T22:46:26Z

    On 07.10.2011 06:41, Jeff Davis wrote:
    > The repo is available here:
    >
    > http://git.postgresql.org/gitweb/?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes
    
    I took a look at this, fixed a bunch of minor issues, and pushed to my 
    git repo: git://git.postgresql.org/git/users/heikki/postgres.git. 
    Attached is an updated patch including those changes, for the archives.
    
    A few issues that I didn't fix straight away:
    
    * Why do you need to be a superuser to create a range type? In 
    DefineRange, the comment explaining why superuser privilege is required 
    seems bogus, copy-pasted from base types. So what is the reason?
    
    * The binary i/o format includes the length of the lower and upper 
    bounds twice, once explicitly in range_send, and second time within the 
    send-function of the subtype. Seems wasteful.
    
    * In findRangeSubOpclass, I think it's OK to hard-code "btree" into the 
    error message, no need to look that up from pg_am.
    
    * Do we really need non_empty(anyrange) ? You can just do "NOT empty(x)"
    
    * Range_before and range_after: I think "input range is empty" would 
    sound better than "undefined for empty ranges".
    
    * In range_hash, rotating the hash of the initial flags byte seems pointless
    
    * range_constructor_internal - I think it would be better to move logic 
    to figure out the the arguments into the callers.
    
    * The gist support functions frequently call range_deserialize(), which 
    does catalog lookups. Isn't that horrendously expensive?
    
    * In range_serialize and range_deserialize, reduce the number of catalog 
    lookups by combining the get_range_subtype, get_typlen, get_typalign, 
    get_typbyval and get_typstorage calls
    with just one catalog lookup, and fetch all those fields at once. 
    Something like get_typlenbyvalalign()
    
    * Have you tested this on an architecture with strict alignment? I don't 
    see any alignment bugs, but I think there's a lot of potential for them..
    
    Documentation
    -------------
    * Section "Built-in Range Types". How about "PostgreSQL comes with the 
    following built-in range types:" <examples> "In addition, you can define 
    your own"
    
    * In section "Inclusive and Exclusive Bounds", it is said "An inclusive 
    lower bound is represented by ...". That begs the question: where is it 
    represented like that? That doesn't become clear until you read the 
    section on Input/Output format. Reordering the sections might help. Are 
    the double-quotes around [ and ( necessary?
    
    * It would be good to have some examples on the input formats in the 
    Input/Output section. There's one in the Examples section, but it's not 
    very clear that it's related. Perhaps the whole Examples section should 
    be moved to the end, or the examples be dispersed into the other sections.
    
    * Likewise it would be nice to have some examples in the Inclusive and 
    Exclusive Bounds section.
    
    * subtype_float, how is it supposed to work? I couldn't find any 
    explanation in the docs. Can we come up with a better name for the function?
    
    * Constructing Ranges section: should describe the 0-3 argument forms of 
    the constructors, not only in the examples but in the main text too.
    
    * What exactly is canonical function supposed to return? It's not clear 
    what format one should choose as the canonical format when writing a 
    custom range type. And even for the built-in types, it would be good to 
    explain which types use which canonical format (I saw the discussions on 
    that, so I understand that might still be subject to change).
    
    * Defining New RangeTypes section: A more general description in 
    addition to the example would be good.
    
    * "GiST Indexing" title: how about making it just "Indexing". Also, it 
    would be nice to list exactly which operators can be sped up by gist.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
  64. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-15T01:58:29Z

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
    > * Have you tested this on an architecture with strict alignment? I don't 
    > see any alignment bugs, but I think there's a lot of potential for them..
    
    Well, fwiw, this patch passes its regression tests on HPPA, except for this
    which seems more to do with the already-noted unacceptable dependency on
    non-C collations:
    
    *** /home/postgres/pgsql/src/test/regress/expected/rangetypes.out	Fri Oct 14 21:19:19 2011
    --- /home/postgres/pgsql/src/test/regress/results/rangetypes.out	Fri Oct 14 21:50:11 2011
    ***************
    *** 842,857 ****
      --
      create type textrange_c as range(subtype=text, collation="C");
      create type textrange_en_us as range(subtype=text, collation="en_US");
      select textrange_c('a','Z') @> 'b'::text;
      ERROR:  range lower bound must be less than or equal to range upper bound
      select textrange_en_us('a','Z') @> 'b'::text;
    !  ?column? 
    ! ----------
    !  t
    ! (1 row)
    ! 
      drop type textrange_c;
      drop type textrange_en_us;
      --
      -- Test out polymorphic type system
      --
    --- 842,858 ----
      --
      create type textrange_c as range(subtype=text, collation="C");
      create type textrange_en_us as range(subtype=text, collation="en_US");
    + ERROR:  collation "en_US" for encoding "SQL_ASCII" does not exist
      select textrange_c('a','Z') @> 'b'::text;
      ERROR:  range lower bound must be less than or equal to range upper bound
      select textrange_en_us('a','Z') @> 'b'::text;
    ! ERROR:  function textrange_en_us(unknown, unknown) does not exist
    ! LINE 1: select textrange_en_us('a','Z') @> 'b'::text;
    !                ^
    ! HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
      drop type textrange_c;
      drop type textrange_en_us;
    + ERROR:  type "textrange_en_us" does not exist
      --
      -- Test out polymorphic type system
      --
    
    ======================================================================
    
    
    Also, I notice you forgot to update serial_schedule:
    
    diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
    index bb654f9c612970ef777e8cc39369a91e343f6afc..2e87d9eefd6fbb70a5603bc000ffe833
    5945201f 100644
    *** a/src/test/regress/serial_schedule
    --- b/src/test/regress/serial_schedule
    *************** test: txid
    *** 18,23 ****
    --- 18,24 ----
      test: uuid
      test: enum
      test: money
    + test: rangetypes
      test: strings
      test: numerology
      test: point
    
    
    			regards, tom lane
    
    
  65. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-16T22:09:13Z

    On Sat, 2011-10-15 at 01:46 +0300, Heikki Linnakangas wrote:
    > On 07.10.2011 06:41, Jeff Davis wrote:
    > > The repo is available here:
    > >
    > > http://git.postgresql.org/gitweb/?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes
    > 
    > I took a look at this, fixed a bunch of minor issues, and pushed to my 
    > git repo: git://git.postgresql.org/git/users/heikki/postgres.git. 
    > Attached is an updated patch including those changes, for the archives.
    
    Thank you for the review!
    
    Updated patch attached (and pushed to repo).
    
    > A few issues that I didn't fix straight away:
    > 
    > * Why do you need to be a superuser to create a range type? In 
    > DefineRange, the comment explaining why superuser privilege is required 
    > seems bogus, copy-pasted from base types. So what is the reason?
    
    No particular reason. I thought someone might ask me to include such as
    restriction. I have removed it to allow non-superuser creation.
    
    > * The binary i/o format includes the length of the lower and upper 
    > bounds twice, once explicitly in range_send, and second time within the 
    > send-function of the subtype. Seems wasteful.
    
    Any ideas how to fix that? How else do I know how much the underlying
    send function will consume?
    
    > * In findRangeSubOpclass, I think it's OK to hard-code "btree" into the 
    > error message, no need to look that up from pg_am.
    
    Done.
    
    > * Do we really need non_empty(anyrange) ? You can just do "NOT empty(x)"
    
    To make it a searchable (via GiST) condition, I need an operator. I
    could either remove that operator (as it's not amazingly useful), or I
    could just not document the function but leave the operator there.
    
    > * Range_before and range_after: I think "input range is empty" would 
    > sound better than "undefined for empty ranges".
    
    Done.
    
    > * In range_hash, rotating the hash of the initial flags byte seems pointless
    
    Fixed.
    
    > * range_constructor_internal - I think it would be better to move logic 
    > to figure out the the arguments into the callers.
    
    Done.
    
    > * The gist support functions frequently call range_deserialize(), which 
    > does catalog lookups. Isn't that horrendously expensive?
    
    Yes, it was. I have introduced a cached structure that avoids syscache
    lookups when it's the same range as the last lookup (the common case).
    
    > * In range_serialize and range_deserialize, reduce the number of catalog 
    > lookups by combining the get_range_subtype, get_typlen, get_typalign, 
    > get_typbyval and get_typstorage calls
    > with just one catalog lookup, and fetch all those fields at once. 
    > Something like get_typlenbyvalalign()
    
    Done in conjunction with the cache structure I mentioned above.
    
    > * Have you tested this on an architecture with strict alignment? I don't 
    > see any alignment bugs, but I think there's a lot of potential for them..
    
    I don't have such an architecture readily available.
    
    > Documentation
    > -------------
    > * Section "Built-in Range Types". How about "PostgreSQL comes with the 
    > following built-in range types:" <examples> "In addition, you can define 
    > your own"
    
    Done.
    
    > * In section "Inclusive and Exclusive Bounds", it is said "An inclusive 
    > lower bound is represented by ...". That begs the question: where is it 
    > represented like that? That doesn't become clear until you read the 
    > section on Input/Output format. Reordering the sections might help. Are 
    > the double-quotes around [ and ( necessary?
    
    I added a forward reference. Rearranging the sections seemed to create
    the opposite problem. And I removed the extra double-quotes.
    
    > * It would be good to have some examples on the input formats in the 
    > Input/Output section. There's one in the Examples section, but it's not 
    > very clear that it's related. Perhaps the whole Examples section should 
    > be moved to the end, or the examples be dispersed into the other sections.
    
    Done.
    
    > * Likewise it would be nice to have some examples in the Inclusive and 
    > Exclusive Bounds section.
    
    I think the new examples in the I/O section cover that, and there's a
    forward link from the inclusivity/exclusivity section.
    
    > * subtype_float, how is it supposed to work? I couldn't find any 
    > explanation in the docs. Can we come up with a better name for the function?
    
    It's only for the gist penalty function. It's difficult to know how to
    penalize without being able to do some math on the boundaries, so
    subtype_float it meant to return a float that might be useful for that
    purpose.
    
    It's likely that this will change based on Alexander Korotkov's comments
    here:
    
    http://archives.postgresql.org/message-id/CAPpHfdsVow
    +Yeoq5Be=sxPFPsbYG3Gj3QyVvVdt8dg7XqNyHpg@mail.gmail.com
    
    So I will defer the documentation updates until we work that out.
    
    > * Constructing Ranges section: should describe the 0-3 argument forms of 
    > the constructors, not only in the examples but in the main text too.
    
    Done.
    
    > * What exactly is canonical function supposed to return? It's not clear 
    > what format one should choose as the canonical format when writing a 
    > custom range type. And even for the built-in types, it would be good to 
    > explain which types use which canonical format (I saw the discussions on 
    > that, so I understand that might still be subject to change).
    
    The canonical function is just supposed to return a new range such that
    two equal values will have equal representations. I have listed the
    built-in discrete range types and their canonical form.
    
    As far as describing what a custom range type is supposed to use for the
    canonical form, I don't know which part is exactly unclear. There aren't
    too many rules to defining one -- the only guideline is that ranges of
    equal value going in should be put in one canonical representation.
    
    > * Defining New RangeTypes section: A more general description in 
    > addition to the example would be good.
    
    Done.
    
    > * "GiST Indexing" title: how about making it just "Indexing". Also, it 
    > would be nice to list exactly which operators can be sped up by gist.
    
    Done.
    
    Regards,
    	Jeff Davis
    
    
  66. Re: Range Types - typo + NULL string constructor

    Erik Rijkers <er@xs4all.nl> — 2011-10-17T14:07:23Z

    On Mon, October 17, 2011 00:09, Jeff Davis wrote:
    
    To facilitate would-be testers let me mention that to apply this patch one needs to remove the
    catversion changes from the patch.
    
    (as discussed before: http://archives.postgresql.org/pgsql-hackers/2011-02/msg00817.php )
    
    
    Erik Rijkers
    
    
    
    
    
  67. Re: Range Types - typo + NULL string constructor

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-24T10:15:03Z

    On 17.10.2011 01:09, Jeff Davis wrote:
    > On Sat, 2011-10-15 at 01:46 +0300, Heikki Linnakangas wrote:
    >> * The binary i/o format includes the length of the lower and upper
    >> bounds twice, once explicitly in range_send, and second time within the
    >> send-function of the subtype. Seems wasteful.
    >
    > Any ideas how to fix that? How else do I know how much the underlying
    > send function will consume?
    
    Oh, never mind. I was misreading the code, it's not sending the length 
    twice.
    
    >> * range_constructor_internal - I think it would be better to move logic
    >> to figure out the the arguments into the callers.
    >
    > Done.
    
    The comment above range_constructor0() is now outdated.
    
    >> * The gist support functions frequently call range_deserialize(), which
    >> does catalog lookups. Isn't that horrendously expensive?
    >
    > Yes, it was. I have introduced a cached structure that avoids syscache
    > lookups when it's the same range as the last lookup (the common case).
    
    Hmm, I don't think that's safe. After Oid wraparound, a range type oid 
    might get reused for some other range type, and the cache would return 
    stale values. Extremely unlikely to happen by accident, but could be 
    exploited by an attacker.
    
    >> * What exactly is canonical function supposed to return? It's not clear
    >> what format one should choose as the canonical format when writing a
    >> custom range type. And even for the built-in types, it would be good to
    >> explain which types use which canonical format (I saw the discussions on
    >> that, so I understand that might still be subject to change).
    >
    > The canonical function is just supposed to return a new range such that
    > two equal values will have equal representations. I have listed the
    > built-in discrete range types and their canonical form.
    >
    > As far as describing what a custom range type is supposed to use for the
    > canonical form, I don't know which part is exactly unclear. There aren't
    > too many rules to defining one -- the only guideline is that ranges of
    > equal value going in should be put in one canonical representation.
    
    Ok. The name "canonical" certainly hints at that, but it would be good 
    to explicitly state that guideline. As the text stands, it would seem 
    that a canonical function that maps "[1,7]" to "[1,8)", and also vice 
    versa, "[1,8)" to "[1,7]", would be valid. That would be pretty silly, 
    but it would be good to say something like "The canonical output for two 
    values that are equal, like [1,7] and [1,8), must be equal. It doesn't 
    matter which representation you choose to be the canonical one, as long 
    as two equal values with different formattings are always mapped to the 
    same value with same formatting"
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  68. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-25T16:37:12Z

    On Mon, 2011-10-24 at 13:15 +0300, Heikki Linnakangas wrote:
    > Hmm, I don't think that's safe. After Oid wraparound, a range type oid 
    > might get reused for some other range type, and the cache would return 
    > stale values. Extremely unlikely to happen by accident, but could be 
    > exploited by an attacker.
    > 
    
    Any ideas on how to remedy that? I don't have another plan for making it
    perform well. Plugging it into the cache invalidation mechanism seems
    like overkill, but I suppose that would solve the problem.
    
    Aren't there a few other cases like this floating around the code? I
    know the single-xid cache is potentially vulnerable to xid wraparound
    for the same reason.
    
    Regards,
    	Jeff Davis
    
    
    
  69. Re: Range Types - typo + NULL string constructor

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-26T06:53:46Z

    On 25.10.2011 19:37, Jeff Davis wrote:
    > On Mon, 2011-10-24 at 13:15 +0300, Heikki Linnakangas wrote:
    >> Hmm, I don't think that's safe. After Oid wraparound, a range type oid
    >> might get reused for some other range type, and the cache would return
    >> stale values. Extremely unlikely to happen by accident, but could be
    >> exploited by an attacker.
    >
    > Any ideas on how to remedy that? I don't have another plan for making it
    > perform well. Plugging it into the cache invalidation mechanism seems
    > like overkill, but I suppose that would solve the problem.
    
    I think we should look at the array-functions for precedent. array_in et 
    al cache the information in fn_extra, so that when it's called 
    repeatedly in one statement for the same type, the information is only 
    looked up once. That's good enough, it covers repeated execution in a 
    single query, as well as COPY and comparison calls from index searches, 
    for example.
    
    > Aren't there a few other cases like this floating around the code?
    
    Not that I know of. That said, I wouldn't be too surprised if there was.
    
    > I know the single-xid cache is potentially vulnerable to xid wraparound
    > for the same reason.
    
    True.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  70. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-10-26T15:42:52Z

    On Tue, Oct 25, 2011 at 12:37 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Mon, 2011-10-24 at 13:15 +0300, Heikki Linnakangas wrote:
    >> Hmm, I don't think that's safe. After Oid wraparound, a range type oid
    >> might get reused for some other range type, and the cache would return
    >> stale values. Extremely unlikely to happen by accident, but could be
    >> exploited by an attacker.
    >
    > Any ideas on how to remedy that? I don't have another plan for making it
    > perform well. Plugging it into the cache invalidation mechanism seems
    > like overkill, but I suppose that would solve the problem.
    >
    > Aren't there a few other cases like this floating around the code? I
    > know the single-xid cache is potentially vulnerable to xid wraparound
    > for the same reason.
    
    I believe that we're in trouble with XIDs as soon as you have two
    active XIDs that are separated by a billion, because then you could
    have a situation where some people think a given XID is in the future
    and others think it's in the past.  I have been wondering if we should
    have some sort of active guard against that scenario; I don't think we
    do at present.
    
    But OID wraparound is not the same as XID wraparound.  It's far more
    common, I think, for a single transaction to use lots of OIDs than it
    is for it to use lots of XIDs (i.e. have many subtransactions).
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  71. Re: Range Types - typo + NULL string constructor

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-26T15:52:28Z

    On 26.10.2011 18:42, Robert Haas wrote:
    > On Tue, Oct 25, 2011 at 12:37 PM, Jeff Davis<pgsql@j-davis.com>  wrote:
    >> Aren't there a few other cases like this floating around the code? I
    >> know the single-xid cache is potentially vulnerable to xid wraparound
    >> for the same reason.
    >
    > I believe that we're in trouble with XIDs as soon as you have two
    > active XIDs that are separated by a billion, ...
    
    That's not what Jeff is referring to here, though (correct me if I'm 
    wrong). He's talking about the one-item cache in 
    TransactionIdLogFetch(). You don't need need long-running transactions 
    for that to get confused. Specifically, this could happen:
    
    1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
        The row has xmin = 123456, and it is cached as committed in the 
    one-item cache by TransactionLogFetch.
    2. A lot of time passes. Everything is frozen, and XID wrap-around 
    happens. (Session A is idle but not in a transaction, so it doesn't 
    inhibit freezing.)
    3. In session B: BEGIN: INSERT INTO foo (id) VALUES (2); ROLLBACK;
        By coincidence, this transaction was assigned XID 123456.
    4. In session A: SELECT * FROM foo WHERE id = 2;
        The one-item cache still says that 123456 committed, so we return 
    the tuple inserted by the aborted transaction. Oops.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  72. Re: Range Types - typo + NULL string constructor

    Robert Haas <robertmhaas@gmail.com> — 2011-10-26T16:19:47Z

    On Wed, Oct 26, 2011 at 11:52 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 26.10.2011 18:42, Robert Haas wrote:
    >>
    >> On Tue, Oct 25, 2011 at 12:37 PM, Jeff Davis<pgsql@j-davis.com>  wrote:
    >>>
    >>> Aren't there a few other cases like this floating around the code? I
    >>> know the single-xid cache is potentially vulnerable to xid wraparound
    >>> for the same reason.
    >>
    >> I believe that we're in trouble with XIDs as soon as you have two
    >> active XIDs that are separated by a billion, ...
    >
    > That's not what Jeff is referring to here, though (correct me if I'm wrong).
    > He's talking about the one-item cache in TransactionIdLogFetch(). You don't
    > need need long-running transactions for that to get confused. Specifically,
    > this could happen:
    >
    > 1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
    >   The row has xmin = 123456, and it is cached as committed in the one-item
    > cache by TransactionLogFetch.
    > 2. A lot of time passes. Everything is frozen, and XID wrap-around happens.
    > (Session A is idle but not in a transaction, so it doesn't inhibit
    > freezing.)
    > 3. In session B: BEGIN: INSERT INTO foo (id) VALUES (2); ROLLBACK;
    >   By coincidence, this transaction was assigned XID 123456.
    > 4. In session A: SELECT * FROM foo WHERE id = 2;
    >   The one-item cache still says that 123456 committed, so we return the
    > tuple inserted by the aborted transaction. Oops.
    
    Oh, hmm.  That sucks.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  73. Re: Range Types - typo + NULL string constructor

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-10-26T16:31:43Z

    Excerpts from Robert Haas's message of mié oct 26 13:19:47 -0300 2011:
    > On Wed, Oct 26, 2011 at 11:52 AM, Heikki Linnakangas
    > <heikki.linnakangas@enterprisedb.com> wrote:
    
    > > 1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
    > >   The row has xmin = 123456, and it is cached as committed in the one-item
    > > cache by TransactionLogFetch.
    > > 2. A lot of time passes. Everything is frozen, and XID wrap-around happens.
    > > (Session A is idle but not in a transaction, so it doesn't inhibit
    > > freezing.)
    > > 3. In session B: BEGIN: INSERT INTO foo (id) VALUES (2); ROLLBACK;
    > >   By coincidence, this transaction was assigned XID 123456.
    > > 4. In session A: SELECT * FROM foo WHERE id = 2;
    > >   The one-item cache still says that 123456 committed, so we return the
    > > tuple inserted by the aborted transaction. Oops.
    > 
    > Oh, hmm.  That sucks.
    
    Can this be solved by simple application of the Xid epoch?
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  74. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-26T16:47:05Z

    On Wed, 2011-10-26 at 12:19 -0400, Robert Haas wrote:
    > > 1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
    > >   The row has xmin = 123456, and it is cached as committed in the one-item
    > > cache by TransactionLogFetch.
    > > 2. A lot of time passes. Everything is frozen, and XID wrap-around happens.
    > > (Session A is idle but not in a transaction, so it doesn't inhibit
    > > freezing.)
    > > 3. In session B: BEGIN: INSERT INTO foo (id) VALUES (2); ROLLBACK;
    > >   By coincidence, this transaction was assigned XID 123456.
    > > 4. In session A: SELECT * FROM foo WHERE id = 2;
    > >   The one-item cache still says that 123456 committed, so we return the
    > > tuple inserted by the aborted transaction. Oops.
    
    Yes, that's the scenario I was talking about.
    
    > Oh, hmm.  That sucks.
    
    It didn't seem like a major concern a while ago:
    http://archives.postgresql.org/message-id/28107.1291264452@sss.pgh.pa.us
    
    Regards,
    	Jeff Davis
    
    
    
  75. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-26T17:27:03Z

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
    > That's not what Jeff is referring to here, though (correct me if I'm 
    > wrong). He's talking about the one-item cache in 
    > TransactionIdLogFetch(). You don't need need long-running transactions 
    > for that to get confused. Specifically, this could happen:
    
    > 1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
    >     The row has xmin = 123456, and it is cached as committed in the 
    > one-item cache by TransactionLogFetch.
    > 2. A lot of time passes. Everything is frozen, and XID wrap-around 
    > happens. (Session A is idle but not in a transaction, so it doesn't 
    > inhibit freezing.)
    > 3. In session B: BEGIN: INSERT INTO foo (id) VALUES (2); ROLLBACK;
    >     By coincidence, this transaction was assigned XID 123456.
    > 4. In session A: SELECT * FROM foo WHERE id = 2;
    >     The one-item cache still says that 123456 committed, so we return 
    > the tuple inserted by the aborted transaction. Oops.
    
    I think this is probably a red herring, because it's impossible for a
    session to remain totally idle for that long --- see sinval updating.
    (If you wanted to be really sure, we could have sinval reset clear
    the TransactionLogFetch cache, but I doubt it's worth the trouble.)
    
    			regards, tom lane
    
    
  76. Re: Range Types - typo + NULL string constructor

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-10-26T17:28:52Z

    Robert Haas <robertmhaas@gmail.com> writes:
    > I believe that we're in trouble with XIDs as soon as you have two
    > active XIDs that are separated by a billion, because then you could
    > have a situation where some people think a given XID is in the future
    > and others think it's in the past.  I have been wondering if we should
    > have some sort of active guard against that scenario; I don't think we
    > do at present.
    
    Sure we do.  It's covered by the XID wraparound prevention logic.
    
    			regards, tom lane
    
    
  77. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-10-31T18:01:44Z

    On Mon, 2011-10-24 at 13:15 +0300, Heikki Linnakangas wrote:
    > >> * range_constructor_internal - I think it would be better to move logic
    > >> to figure out the the arguments into the callers.
    > >
    > > Done.
    > 
    > The comment above range_constructor0() is now outdated.
    
    Removed.
    
    > >> * The gist support functions frequently call range_deserialize(), which
    > >> does catalog lookups. Isn't that horrendously expensive?
    > >
    > > Yes, it was. I have introduced a cached structure that avoids syscache
    > > lookups when it's the same range as the last lookup (the common case).
    > 
    > Hmm, I don't think that's safe. After Oid wraparound, a range type oid 
    > might get reused for some other range type, and the cache would return 
    > stale values. Extremely unlikely to happen by accident, but could be 
    > exploited by an attacker.
    
    Done.
    
    > Ok. The name "canonical" certainly hints at that, but it would be good 
    > to explicitly state that guideline. As the text stands, it would seem 
    > that a canonical function that maps "[1,7]" to "[1,8)", and also vice 
    > versa, "[1,8)" to "[1,7]", would be valid. That would be pretty silly, 
    > but it would be good to say something like "The canonical output for two 
    > values that are equal, like [1,7] and [1,8), must be equal. It doesn't 
    > matter which representation you choose to be the canonical one, as long 
    > as two equal values with different formattings are always mapped to the 
    > same value with same formatting"
    
    Used your wording, thank you.
    
    Regards,
    	Jeff Davis
    
  78. Re: Range Types - typo + NULL string constructor

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-11-03T08:37:00Z

    On 17.10.2011 01:09, Jeff Davis wrote:
    > On Sat, 2011-10-15 at 01:46 +0300, Heikki Linnakangas wrote:
    >> * Do we really need non_empty(anyrange) ? You can just do "NOT empty(x)"
    >
    > To make it a searchable (via GiST) condition, I need an operator. I
    > could either remove that operator (as it's not amazingly useful), or I
    > could just not document the function but leave the operator there.
    
    Looking at the most recent patch, I don't actually see any GiST support 
    for the empty and non-empty operators (!? and ?). I don't see how those 
    could be accelerated with GiST, anyway; I think if you want to use an 
    index for those operators, you might as well create a partial or 
    functional index on empty(x).
    
    So I'm actually inclined to remove not only the nonempty function, but 
    also the ? and !? operators. They don't seem very useful, and ? and !? 
    don't feel very intuitive to me, anyway. I'll just leave the empty(x) 
    function.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  79. Re: Range Types - typo + NULL string constructor

    Jeff Davis <pgsql@j-davis.com> — 2011-11-03T08:46:09Z

    On Thu, 2011-11-03 at 10:37 +0200, Heikki Linnakangas wrote:
    > Looking at the most recent patch, I don't actually see any GiST support 
    > for the empty and non-empty operators (!? and ?). I don't see how those 
    > could be accelerated with GiST, anyway; I think if you want to use an 
    > index for those operators, you might as well create a partial or 
    > functional index on empty(x).
    > 
    > So I'm actually inclined to remove not only the nonempty function, but 
    > also the ? and !? operators. They don't seem very useful, and ? and !? 
    > don't feel very intuitive to me, anyway. I'll just leave the empty(x) 
    > function.
    
    That's fine with me. At best, they were only marginally useful.
    
    Regards,
    	Jeff Davis