Thread

  1. BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator

    PG Bug reporting form <noreply@postgresql.org> — 2024-07-31T04:47:23Z

    The following bug has been logged on the website:
    
    Bug reference:      18560
    Logged by:          Hang ammmkilo
    Email address:      ammmkilo@163.com
    PostgreSQL version: 16.3
    Operating system:   ubuntu 20.04
    Description:        
    
    ```create table
    DROP TABLE IF EXISTS t0;
    CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE);
    INSERT INTO t0 VALUES (0, '1');
    INSERT INTO t0 VALUES (0, '');
    ```
    
    ---sql1
    select ('' like ''),('1' like '1');
    
     ?column? | ?column? 
    ----------+----------
     t        | t
    (1 row)
    
    ---sql2
    select c1,(c1 like c1) from t0;
    
     c1 | ?column? 
    ----+----------
     1  | t
        | f
    (2 rows)
    
    In sql1, the result of the (" like ") query returns true. In sql2 (c1 like
    c1) the return value is false when c1 is ". 
    I think this is contradictory and there may be some bug between the 'like'
    and the query column.
    
    
      
    ---select version():
      
                                                           version              
                                            
    ---------------------------------------------------------------------------------------------------------------------
     PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled
    by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
    (1 row)
    
    
  2. Re: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator

    Aleksander Alekseev <aleksander@timescale.com> — 2024-07-31T09:19:59Z

    Hi,
    
    > In sql1, the result of the (" like ") query returns true. In sql2 (c1 like
    > c1) the return value is false when c1 is ".
    > I think this is contradictory and there may be some bug between the 'like'
    > and the query column.
    
    Actually the result is consistent:
    
    ```
    SELECT c1, c1 = ' ' FROM t0;
     c1 | ?column?
    ----+----------
     1  | f
        | t
    (2 rows)
    
    select ' '::char(1) like ' '::char(1);
     ?column?
    ----------
     f
    (1 row)
    ```
    
    Although I find it puzzling too that LIKE operators for CHAR(1) and
    TEXT work differently. Not 100% sure if this is intended.
    
    Thoughts?
    
    -- 
    Best regards,
    Aleksander Alekseev
    
    
    
    
  3. Re: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator

    hubert depesz lubaczewski <depesz@depesz.com> — 2024-07-31T09:46:46Z

    On Wed, Jul 31, 2024 at 04:47:23AM +0000, PG Bug reporting form wrote:
    > CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE);
    
    I bet the thing is simply one more edge case that proves that one
    shouldn't be using char(n) datatype:
    https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_char.28n.29
    
    Best regards,
    
    depesz
    
    
    
    
    
  4. Re: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator

    Junwang Zhao <zhjwpku@gmail.com> — 2024-07-31T13:17:38Z

    On Wed, Jul 31, 2024 at 5:20 PM Aleksander Alekseev
    <aleksander@timescale.com> wrote:
    >
    > Hi,
    >
    > > In sql1, the result of the (" like ") query returns true. In sql2 (c1 like
    > > c1) the return value is false when c1 is ".
    > > I think this is contradictory and there may be some bug between the 'like'
    > > and the query column.
    >
    > Actually the result is consistent:
    >
    > ```
    > SELECT c1, c1 = ' ' FROM t0;
    >  c1 | ?column?
    > ----+----------
    >  1  | f
    >     | t
    > (2 rows)
    >
    > select ' '::char(1) like ' '::char(1);
    >  ?column?
    > ----------
    >  f
    > (1 row)
    > ```
    >
    > Although I find it puzzling too that LIKE operators for CHAR(1) and
    > TEXT work differently. Not 100% sure if this is intended.
    >
    > Thoughts?
    
    The reason for *select ' '::char(1) like ' '::char(1)* returns false is
    that the pattern got trimmed(see rtrim1) to length 0, and the first ' '::char(1)
    has length 1, so it doesn't match.
    
    But for *select ' '::text like ' '::text*, the pattern is not trimmed.
    
    The rtrim1 function is in a file named oracle_compat.c, can anybody
    verify how oracle behaves in this case?
    
    
    >
    > --
    > Best regards,
    > Aleksander Alekseev
    >
    >
    
    
    -- 
    Regards
    Junwang Zhao
    
    
    
    
  5. Re: BUG #18560: Inconsistent Behavior of PostgreSQL 'LIKE' Operator

    Tom Lane <tgl@sss.pgh.pa.us> — 2024-07-31T14:35:06Z

    hubert depesz lubaczewski <depesz@depesz.com> writes:
    > On Wed, Jul 31, 2024 at 04:47:23AM +0000, PG Bug reporting form wrote:
    >> CREATE TABLE t0(c0 INT NOT NULL, c1 CHAR(1) UNIQUE);
    
    > I bet the thing is simply one more edge case that proves that one
    > shouldn't be using char(n) datatype:
    > https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_char.28n.29
    
    Yeah, exactly.  Some surprising choices were made way-back-when about
    which char(N) operations ought to ignore trailing blanks and which
    shouldn't.  In the case at hand, a closer look shows the problem:
    
    =# explain verbose select c1,(c1 like c1) from t0;
                             QUERY PLAN                          
    -------------------------------------------------------------
     Seq Scan on public.t0  (cost=0.00..43.00 rows=2200 width=6)
       Output: c1, (c1 ~~ (c1)::text)
    
    There is a "char ~~ text" operator but no "char ~~ char" operator,
    so the parser coerces the right-hand "c1" to text --- which causes
    stripping of its trailing blank --- and then applies ~~, which
    treats the trailing blank in its left-hand argument as significant.
    
    Yes, this is arguably inconsistent, but so are a lot of other char(N)
    behaviors.  Proposals to change anything about it have generally
    failed on the grounds that (1) it's not very clear what would work
    better and not just move the surprises around; (2) we risk breaking
    applications that are expecting the current behaviors; and
    (3) char(N) is a deprecated backwater that we shouldn't be putting
    any effort into.
    
    			regards, tom lane