Re: Should we optimize the `ORDER BY random() LIMIT x` case?

Nico Williams <nico@cryptonector.com>

From: Nico Williams <nico@cryptonector.com>
To: Aleksander Alekseev <aleksander@timescale.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Tom Lane <tgl@sss.pgh.pa.us>, Andrei Lepikhov <lepihov@gmail.com>, wenhui qiu <qiuwenhuifx@gmail.com>, Vik Fearing <vik@postgresfriends.org>
Date: 2025-05-19T20:53:35Z
Lists: pgsql-hackers
On Mon, May 19, 2025 at 10:38:19AM -0500, Nico Williams wrote:
> On Mon, May 19, 2025 at 01:25:00PM +0300, Aleksander Alekseev wrote:
> > I agree this would be most convenient for the user. Unfortunately this
> > will require us to check every SELECT query: "oh, isn't it by any
> > chance ORDER BY random() LIMIT x?". I don't think we can afford such
> > a performance degradation, even a small one, for an arguably rare
> > case.
> 
> Can the detection of such queries be done by the yacc/bison parser
> grammar?

Maybe the `sortby` rule could check if the expression is `random()`,
then `sort_clause` could check if `$3` is a one-item `sortby_list` of
just `random()` and mark `$$` as special -- this should be cheap, yes?
We'd still need to check for `LIMIT` somewhere else.
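
To make the idea concrete, here is a rough sketch in plain C of the two checks the grammar actions would perform. Everything in it is hypothetical: `SketchExpr`, `SketchSortBy`, `expr_is_random_call()` and `sort_list_is_random_only()` are simplified stand-ins, not PostgreSQL's actual parse nodes or any existing function in gram.y. It also matches on the function name only, so it cannot tell whether `random` would resolve to the built-in function; that would presumably have to be confirmed later, after name resolution.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parsed expression.
 * This is NOT PostgreSQL's real node layout. */
typedef struct SketchExpr
{
    bool        is_func_call;   /* true if the expression is a function call */
    const char *func_name;      /* unqualified function name, if a call */
    int         nargs;          /* number of arguments passed */
} SketchExpr;

/* Hypothetical stand-in for one ORDER BY item. */
typedef struct SketchSortBy
{
    const SketchExpr *expr;
    bool              is_random;    /* flag a `sortby` action could set */
} SketchSortBy;

/* What a `sortby` action could compute: is this a zero-argument call
 * to something named "random"? (Name match only; an assumption.) */
static bool
expr_is_random_call(const SketchExpr *e)
{
    return e != NULL &&
           e->is_func_call &&
           e->nargs == 0 &&
           e->func_name != NULL &&
           strcmp(e->func_name, "random") == 0;
}

/* What a `sort_clause` action could check on its list: exactly one
 * item, and that item was flagged as random() by the rule above. */
static bool
sort_list_is_random_only(const SketchSortBy *items, size_t n)
{
    return n == 1 && items[0].is_random;
}
```

Both checks are a handful of comparisons per `ORDER BY` clause, which is why I'd expect the cost to be negligible for queries that have no `ORDER BY random()`.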