Re: [HACKERS] Postgres' lexer

From: Leon <leon@udmnet.ru>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Thomas Lockhart <lockhart@alumni.caltech.edu>, pgsql-hackers@postgresql.org
Date: 1999-09-02T12:25:11Z
Lists: pgsql-hackers

Tom Lane wrote:


> To my mind, without spaces this construction *is* ambiguous, and frankly
> I'd have expected the second interpretation ('+-' is a single operator
> name).  Almost every computer language in the world uses "greedy"
> tokenization where the next token is the longest series of characters
> that can validly be a token.  I don't regard the above behavior as
> predictable, natural, nor obvious.  In fact, I'd say it's a bug that
> "3+-2" and "3+-x" are not lexed in the same way.
> 

I completely agree. Lexing "3+-2" and "3+-x" differently looks like a bug.

> However, aside from arguing about whether the current behavior is good
> or bad, these examples seem to indicate that it doesn't take an infinite
> amount of lookahead to reproduce the behavior.  It looks to me like we
> could preserve the current behavior by parsing a '-' as a separate token
> if it *immediately* precedes a digit, and otherwise allowing it to be
> folded into the preceding operator.  That could presumably be done
> without VLTC.

OK. If we *have* to preserve the old, weird behavior, here is the patch.
It is to be applied on top of all my other patches. If the decision
were mine, though, I would not restore the old behavior, because it
is an inconsistency in the grammar, i.e. a bug.
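
For readers following along, the rule Tom describes can be sketched
roughly as below. This is only an illustration in Python, not the
attached patch and not the actual scan.l code: the function name
tokenize, the OP_CHARS set, and the simplified number and identifier
rules are all assumptions made for the example. Operators are scanned
greedily, and one character of lookahead decides whether a trailing
'-' is split off.

```python
# Illustrative sketch only: names and the operator character set are
# assumptions, not taken from the PostgreSQL sources.
OP_CHARS = set("+-*/<>=~!@#%^&|?")


def tokenize(text):
    """Split text into tokens using greedy operator matching, except that
    a trailing '-' immediately followed by a digit becomes its own token."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        j = i
        if ch.isdigit():
            # Simplified integer literal: a run of digits.
            while j < n and text[j].isdigit():
                j += 1
        elif ch.isalpha() or ch == "_":
            # Simplified identifier: letters, digits, underscores.
            while j < n and (text[j].isalnum() or text[j] == "_"):
                j += 1
        elif ch in OP_CHARS:
            # Greedy: take the longest run of operator characters.
            while j < n and text[j] in OP_CHARS:
                j += 1
            # One character of lookahead: if the run has more than one
            # character, ends in '-', and a digit follows directly, give
            # the '-' back so it is lexed as a separate token.
            if j - i > 1 and text[j - 1] == "-" and j < n and text[j].isdigit():
                j -= 1
        else:
            # Anything else (parentheses, commas, ...) is a one-char token.
            j = i + 1
        tokens.append(text[i:j])
        i = j
    return tokens


print(tokenize("3+-2"))  # '-' directly precedes a digit, so it is split off
print(tokenize("3+-x"))  # no digit follows, so '+-' stays one operator
```

The sketch needs only a single character of lookahead, which is the
point of Tom's suggestion. It also reproduces the inconsistency under
discussion: "3+-2" and "3+-x" come out with different token boundaries.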

-- 
Leon.