Thread

  1. Re: [HACKERS] Postgres' lexer

    Leon <leon@udmnet.ru> — 1999-09-02T12:25:11Z

    Tom Lane wrote:
    
    
    > To my mind, without spaces this construction *is* ambiguous, and frankly
    > I'd have expected the second interpretation ('+-' is a single operator
    > name).  Almost every computer language in the world uses "greedy"
    > tokenization where the next token is the longest series of characters
    > that can validly be a token.  I don't regard the above behavior as
    > predictable, natural, nor obvious.  In fact, I'd say it's a bug that
    > "3+-2" and "3+-x" are not lexed in the same way.
    > 
    
    Completely agree. Lexing those two cases differently looks like a bug.
    
    > However, aside from arguing about whether the current behavior is good
    > or bad, these examples seem to indicate that it doesn't take an infinite
    > amount of lookahead to reproduce the behavior.  It looks to me like we
    > could preserve the current behavior by parsing a '-' as a separate token
    > if it *immediately* precedes a digit, and otherwise allowing it to be
    > folded into the preceding operator.  That could presumably be done
    > without VLTC.
    
    OK. If we *have* to preserve the old, weird behavior, here is the patch.
    It is to be applied on top of all my other patches. If it were up to me,
    though, I would not restore the old behavior, because it is an
    inconsistency in the grammar, i.e. a bug.
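
    For illustration only, the rule Tom describes can be sketched as a toy
    tokenizer. This is neither the patch mentioned above nor the real
    Postgres lexer: the operator character set, the function name tokenize,
    and the flag split_minus_before_digit are all assumptions made for the
    example. Operators are read greedily; a trailing '-' is split off only
    when the very next character is a digit, which needs one character of
    lookahead.

```python
# Toy sketch, not the actual Postgres lexer. The operator character set
# below is an assumption chosen for the example.
OP_CHARS = set("+-*/<>=~!@#%^&|?")


def tokenize(text, split_minus_before_digit=True):
    """Split text into number, identifier and operator tokens.

    Operators are matched greedily (longest run of operator characters).
    If split_minus_before_digit is true, a '-' that ends a multi-character
    operator run and immediately precedes a digit is left for the next
    token, as in the rule quoted above.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < n and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch.isalpha() or ch == "_":
            j = i
            while j < n and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch in OP_CHARS:
            j = i
            while j < n and text[j] in OP_CHARS:
                j += 1
            # One character of lookahead: give back a trailing '-' when a
            # digit follows it directly.
            if (split_minus_before_digit and j - i > 1
                    and text[j - 1] == "-" and j < n and text[j].isdigit()):
                j -= 1
            tokens.append(text[i:j])
            i = j
        else:
            # Anything else (parentheses, commas, ...) is a one-char token.
            tokens.append(ch)
            i += 1
    return tokens


print(tokenize("3+-2"))   # old behavior: '-' split off before the digit
print(tokenize("3+-x"))   # '+-' stays a single operator
print(tokenize("3+-2", split_minus_before_digit=False))  # purely greedy
```

    With the flag on, "3+-2" and "3+-x" are tokenized differently, which is
    the inconsistency under discussion; with it off, both contain the single
    operator '+-'.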
    
    -- 
    Leon.